Compare commits

...

946 Commits

Author SHA1 Message Date
Moonlacer
f510f95b71 copyright: Update year and names of many .cpp and .h files 2021-08-10 13:07:49 -05:00
Moonlacer
128bdd127b yuzu-copyright: Update year and names of many .cpp and .h files 2021-08-10 13:00:50 -05:00
Mai M
2da91ec75b Merge pull request #6844 from ameerj/vp9-empty-frame
vp9: Ensure the first frame is complete
2021-08-08 19:02:39 -04:00
bunnei
b9eee1c539 Merge pull request #6843 from FernandoS27/lives-in-a-pineapple-under-the-sea-2
yuzu-cmd/CMakeLists: Correct attribution for this function.
2021-08-08 11:31:47 -07:00
Fernando Sahmkow
23ca1eb82e yuzu-cmd/CMakeLists: Correct attribution for this function. 2021-08-08 20:24:53 +02:00
ameerj
fa22695705 vp9: Ensure the first frame is complete
Silences a runtime error due to the first frame missing the frame data, and being set to hidden despite being a key-frame.
2021-08-08 13:49:00 -04:00
Fernando S
859deda3bb Merge pull request #6834 from K0bin/buffer-image-granularity
Respect Vulkan bufferImageGranularity
2021-08-08 11:57:40 +02:00
bunnei
b023413c98 Merge pull request #6698 from german77/SDL_QoL
input_common: Improve SDL joystick and hide toggle option
2021-08-08 02:44:42 -07:00
bunnei
00358e2098 Merge pull request #6817 from gidoly/patch-1
Add description to fast gpu time option
2021-08-08 01:11:47 -07:00
german77
48b6d41f1b input_common: Improve SDL joystick and hide toggle option 2021-08-07 23:11:23 -05:00
bunnei
63325cafbe Merge pull request #6827 from Morph1984/uuid-hash
common: uuid: Add hash function for UUID
2021-08-07 17:18:46 -07:00
bunnei
bd0e1d3a25 Merge pull request #6830 from ameerj/nvdec-unimpld-codec
nvdec: Better logging for unimplemented codecs
2021-08-07 12:37:39 -07:00
Robin Kertels
bb29dcb7f2 vulkan_memory_allocator: Respect bufferImageGranularity 2021-08-07 15:28:05 +02:00
bunnei
456adb95ff Merge pull request #6795 from sankasan/cmd-remove-cursor-fullscreen
yuzu-cmd: hide mouse cursor when started fullscreen
2021-08-07 02:00:29 -07:00
bunnei
bd1a764827 Merge pull request #6815 from german77/ui_improvements
settings_ui: Add emulated joystick position dot to controller preview
2021-08-06 23:54:23 -07:00
ameerj
928b64d2ce nvdec: Better logging for unimplemented codecs 2021-08-07 01:08:33 -04:00
bunnei
268b5764c7 Merge pull request #6791 from ameerj/astc-opt
astc_decoder: Various performance and memory optimizations
2021-08-06 21:45:24 -07:00
bunnei
f183668a87 Merge pull request #6799 from ameerj/vp9-fixes
nvdec: Fix VP9 reference frame refreshes
2021-08-06 17:46:46 -07:00
ameerj
156ea746a3 nvhost_nvdec_common: Remove BufferMap
This was mainly used to keep track of mapped buffers for later unmapping.  Since unmap is no longer implemented, this no longer seves a valuable purpose.
2021-08-06 20:11:12 -04:00
ameerj
e3688f0627 vp9: Cleanup unused variables
With reference frames refreshes fix, we no longer need to buffer two frames in advance.
We can also remove other unused or otherwise unneeded variables.
2021-08-06 20:08:11 -04:00
ameerj
a3f80a97a3 vp9: Fix reference frame refreshes
This resolves the artifacting when decoding VP9 streams.
2021-08-06 20:08:08 -04:00
ameerj
cc8ac112fc nvhost_nvdec_common: Stub UnmapBuffer Ioctl
Skip unmapping nvdec buffers to avoid breaking the continuity of the VP9 reference frame addresses, and the risk of invalidating data before the async GPU thread is done with it.
2021-08-06 20:06:30 -04:00
bunnei
42d8e08f78 Merge pull request #6822 from yzct12345/clion-assert
assert: Avoid empty macros
2021-08-05 22:29:45 -07:00
Morph
d20c5ac720 common: uuid: Add hash function for UUID
Used when UUID is a key in an unordered_map. The hash is produced by XORing the high and low 64-bits of the UUID together.
2021-08-06 00:41:55 -04:00
gidoly
8ba551e1cd Update configure_graphics_advanced.ui
add description too fast gpu time
2021-08-06 06:08:12 +09:00
bunnei
e1a92db519 Merge pull request #6813 from Morph1984/hex-string-to-uuid
common: uuid: Add hex string to UUID constructor
2021-08-05 13:29:11 -07:00
yzct12345
7e846be376 assert: Verify formatting 2021-08-05 17:46:22 +00:00
yzct12345
346149dcf9 assert: Avoid empty macros 2021-08-05 17:21:56 +00:00
Mai M
a1cb453470 Merge pull request #6819 from Morph1984/i-am-dumb
applet_swkbd: Include the null terminator in the buffer size calculation
2021-08-04 23:32:01 -04:00
Mai M
9a7d2e3659 Merge pull request #6818 from Morph1984/hex-util-bug
hex_util: Fix incorrect array size in AsArray
2021-08-04 23:31:04 -04:00
Morph
f10dc35dd0 applet_swkbd: Include the null terminator in the buffer size calculation
Some games may interpret the read string as a null-terminated string instead of just reading the string up to buffer_size.
2021-08-04 22:32:09 -04:00
Morph
7b39215c8a hex_util: Fix incorrect array size in AsArray
Although this isn't used, this is a potential bug as HexStringToArray will perform an out-of-bounds read.
2021-08-04 22:16:29 -04:00
Morph
edb9c72e26 Merge pull request #6816 from lat9nq/fix-mult-contrl
config: Read connected setting for controllers
2021-08-04 22:05:53 -04:00
lat9nq
be16d92060 config: Read connected setting for controllers
Currently yuzu will read the mapping but does not connect a controller
despite adding subsequent configurations for it. Read the `connected`
setting for now as a boolean like the Qt frontend.
2021-08-04 17:09:35 -04:00
german77
d5bf597436 settings_ui: Use better colors for the light theme 2021-08-04 11:47:06 -05:00
german77
1fb158ce90 settings_ui: Add emulated joystick position dot to controller preview 2021-08-04 11:46:54 -05:00
Morph
705f111058 common: uuid: Add hex string to UUID constructor
This allows for easily converting a hex string into a Common::UUID, which is backed by a 128 bit unsigned integer.
2021-08-04 10:45:41 -04:00
yzct12345
2868d4ba84 nvdec: Implement VA-API hardware video acceleration (#6713)
* nvdec: VA-API

* Verify formatting

* Forgot a semicolon for Windows

* Clarify comment about AV_PIX_FMT_NV12

* Fix assert log spam from missing negation

* vic: Remove forgotten debug code

* Address lioncash's review

* Mention VA-API is Intel/AMD

* Address v1993's review

* Hopefully fix CMakeLists style this time

* vic: Improve cache locality

* vic: Fix off-by-one error

* codec: Async

* codec: Forgot the GetValue()

* nvdec: Address ameerj's review

* codec: Fallback to CPU without VA-API support

* cmake: Address lat9nq's review

* cmake: Make VA-API optional

* vaapi: Multiple GPU

* Apply suggestions from code review

Co-authored-by: Ameer J <52414509+ameerj@users.noreply.github.com>

* nvdec: Address ameerj's review

* codec: Use anonymous instead of static

* nvdec: Remove enum and fix memory leak

* nvdec: Address ameerj's review

* codec: Remove preparation for threading

Co-authored-by: Ameer J <52414509+ameerj@users.noreply.github.com>
2021-08-03 23:43:11 -04:00
Morph
d16a337d98 Merge pull request #6805 from lat9nq/fix-user-profiles
config: Only read/write current_user on global config
2021-08-02 23:58:41 -04:00
lat9nq
6ab0c6a808 config: Only read/write current_user on global config 2021-08-02 18:29:24 -04:00
Morph
ffe553edbf Merge pull request #6801 from spholz/spholz-patch-1
network: fix ternary operator in Socket::SendTo
2021-08-02 14:56:16 -04:00
spholz
e71f78d04c network: fix ternary operator in Socket::SendTo 2021-08-02 20:12:12 +02:00
yzct12345
f56d0db5bd decoders: Optimize swizzle copy performance (#6790)
This makes UnswizzleTexture up to two times faster. It is the main bottleneck in NVDEC video decoding.
2021-08-02 11:18:58 -04:00
san
3e26141483 yuzu-cmd: hide cursor when in fullscreen
Exposed the SDL_ShowCursor function to EmuWindow baseclass. When creating the window (GL or VK) in fullscreen it now automatically hides the cursor.
2021-08-01 21:46:13 +02:00
Malte Jürgens
381aacdbb1 game_list: Make game list folder icons smaller (#6762)
Makes the default game list folder icons 48x48 by default instead of 64x64, and allows for selecting small (24x24) and large (72x72) icon sizes.
2021-08-01 12:59:36 -04:00
Morph
d20bcb7faf Merge pull request #6793 from Morph1984/lang-fix
service: set: Correct copy amount in GetAvailableLanguageCodes
2021-08-01 12:47:47 -04:00
Morph
3b4d427993 service: set: Correct copy amount in GetAvailableLanguageCodes 2021-08-01 11:59:52 -04:00
Fernando S
30f0b7cf31 Merge pull request #6720 from ameerj/vk-screenshot
renderer_vulkan: Implement screenshots
2021-08-01 13:31:33 +02:00
Ameer J
db32c3762b Merge pull request #6765 from ReinUsesLisp/y-negate-vk
vk_rasterizer: Flip viewport on Y_NEGATE
2021-08-01 01:47:37 -04:00
Ameer J
a086ee6a00 Merge pull request #6565 from lat9nq/bundle-ffmpeg
cmake, ci: Build bundled FFmpeg with yuzu
2021-08-01 01:34:10 -04:00
ameerj
c439fc9be9 astc_decoder: Reduce workgroup size
This reduces the amount of over dispatching when there are odd dimensions (i.e. ASTC 8x5), which rarely evenly divide into 32x32.
2021-08-01 01:22:27 -04:00
ameerj
5ab8053511 astc_decoder: Compute offset swizzles in-shader
Alleviates the dependency on the swizzle table and a uniform which is constant for all ASTC texture sizes.
2021-08-01 01:22:26 -04:00
ameerj
b2862e4772 astc_decoder: Make use of uvec4 for payload data 2021-07-31 22:28:04 -04:00
ameerj
a75d70fa90 astc_decoder: Simplify Select2DPartition 2021-07-31 21:36:26 -04:00
ameerj
5665d05547 astc_decoder: Optimize the use EncodingData
This buffer was a list of EncodingData structures sorted by their bit length, with some duplication from the cpu decoder implementation.
We can take advantage of its sorted property to optimize its usage in the shader.

Thanks to wwylele for the optimization idea.
2021-07-31 21:36:26 -04:00
ameerj
15c0c213b1 astc.h: Move data to cpp implementation
Moves leftover values that are no longer used by the gpu decoder back to the cpp implementation.
2021-07-31 21:26:42 -04:00
Mai M
7cf0958b06 Merge pull request #6788 from Morph1984/hle_api_12.1.0
hle: api_version: Update HOS version to 12.1.0
2021-07-31 20:12:35 -04:00
Morph
9143fe5d3a hle: api_version: Update HOS version to 12.1.0
Keeps us up to date with reporting the system version.
2021-07-31 14:39:49 -04:00
bunnei
47f13a9df4 Merge pull request #6752 from Morph1984/pt-br
service: ns, set: Add PT_BR (Brazilian Portuguese)
2021-07-30 14:42:11 -07:00
bunnei
2c7fdee7a7 Merge pull request #6775 from lat9nq/cmd-remove-global-core
emu_window: Remove global system instance
2021-07-30 13:28:24 -07:00
bunnei
7530594602 Merge pull request #6759 from ReinUsesLisp/pipeline-statistics
renderer_vulkan: Add setting to log pipeline statistics
2021-07-30 11:18:52 -07:00
bunnei
0334b9b776 Merge pull request #6770 from Morph1984/swkbd_buffer_size
applet_swkbd: Correct string buffer size calculation
2021-07-30 09:26:35 -07:00
lat9nq
335de3fdf5 emu_window: Remove global system instance
It was just the one in emu_window_sdl2, but since _gl and _vk inherit
from it, they all needed adjustments.

Leaves just the one auto system& in main().
2021-07-30 10:43:58 -04:00
Morph
ba3d230421 applet_swkbd: Correct string buffer size calculation
The buffer size here does not include the initial 8 bytes.
2021-07-30 02:19:04 -04:00
Morph
275db94bb8 configure_system: Add Brazilian Portuguese to the list of languages 2021-07-30 02:15:53 -04:00
Morph
6ca8ed9e58 service: set: Correct 4.0.0 max_entries to 0x40 (64) instead of 17 2021-07-30 02:15:53 -04:00
Morph
21ff0a3d6e service: ns, set: Add PT_BR (Brazilian Portuguese) 2021-07-30 02:15:53 -04:00
Morph
db07ca6c7f Merge pull request #6767 from ReinUsesLisp/fold-float-pack
shader: Fold UnpackFloat2x16 and PackFloat2x16
2021-07-30 02:07:52 -04:00
bunnei
a98f14e9b0 Merge pull request #6722 from ReinUsesLisp/xmad-opts
shader: Fold integer FMA from Nvidia's pattern
2021-07-29 18:45:37 -07:00
ReinUsesLisp
8c9febe8f7 shader: Fold UnpackFloat2x16 and PackFloat2x16
Simplifies the code a bit when possible. These instructions should be
no-ops codegen wise.
2021-07-29 21:22:52 -03:00
Ameer J
23b3333f72 Merge pull request #6751 from Morph1984/languagecode
service: ns: Map ZH_TW and ZH_CN to Traditional/Simplified Chinese
2021-07-29 14:41:03 -04:00
bunnei
5acf020389 Merge pull request #6742 from Morph1984/uuid
common: uuid: Return a lower-case hex string in Format
2021-07-29 09:37:15 -07:00
ReinUsesLisp
b185567a03 vk_rasterizer: Flip viewport on Y_NEGATE
Matches OpenGL's behavior. I don't believe this register flips geometry,
but we have to try to match behavior on both backends.
2021-07-29 02:17:53 -03:00
ameerj
7ac99bb127 renderers: Add explicit invert_y bool to screenshot callback
OpenGL and Vulkan images render in different coordinate systems. This allows us to specify the coordinate system of the screenshot within each renderer
2021-07-28 21:46:08 -04:00
ameerj
75e7f54fb0 renderer_vulkan: Implement screenshots 2021-07-28 21:45:55 -04:00
ameerj
548bac8989 vk_blit_screen: Add public CreateFramebuffer method 2021-07-28 21:43:02 -04:00
ameerj
1e6c5d323d vk_blit_screen: Make Draw method more generic
Allows specifying the framebuffer and render area dimensions, rather than being hard coded for the render window.
2021-07-28 21:37:30 -04:00
bunnei
124e3b4819 Merge pull request #6760 from ReinUsesLisp/fp16-collect
shader: Mark ConvertF16F32 and ConvertF32F16 as fp16 instructions
2021-07-28 14:55:06 -07:00
bunnei
f771d92e44 Merge pull request #6758 from jbeich/fastmem
host_memory: enable fastmem on FreeBSD
2021-07-28 13:01:54 -07:00
bunnei
d05e6003f0 Merge pull request #6700 from lat9nq/fullscreen-enum
general: Implement FullscreenMode enumeration
2021-07-28 11:36:42 -07:00
Morph
5593a3716e Merge pull request #6671 from jls47/master
applets/web: Addressing QT Navigation issues in Linux
2021-07-27 22:46:27 -04:00
Ameer J
d923ec5805 Merge pull request #6753 from jbeich/libusb
cmake: unbreak libusb detection on FreeBSD
2021-07-27 21:26:17 -04:00
ReinUsesLisp
1bb46b7d64 shader: Mark ConvertF16F32 and ConvertF32F16 as fp16 instructions
Fixes instances where fp16 types are not declared on SPIR-V but they are
used. This shouldn't happen on master, as it's been uncovered by an
additional optimization pass.
2021-07-27 21:33:05 -03:00
ReinUsesLisp
3b006f4fe2 renderer_vulkan: Add setting to log pipeline statistics
Use VK_KHR_pipeline_executable_properties when enabled and available to
log statistics about the pipeline cache in a game.

For example, this is on Turing GPUs when generating a pipeline cache
from Super Smash Bros. Ultimate:

Average pipeline statistics
==========================================
Code size:       6433.167
Register count:    32.939

More advanced results could be presented, at the moment it's just an
average of all 3D and compute pipelines.
2021-07-27 21:29:24 -03:00
bunnei
92887a65f0 Merge pull request #6749 from lioncash/rtarget
render_target: Add missing initializer for size extent
2021-07-27 17:28:53 -07:00
bunnei
6053f2e1fe Merge pull request #6730 from Morph1984/buf_to_stdstring
common: fs: fs_util: Add BufferToUTF8String
2021-07-27 15:50:54 -07:00
Jan Beich
353be2306c host_memory: Add workaround for FreeBSD 12
src/common/host_memory.cpp:360:14: error: use of undeclared identifier
      'memfd_create'
        fd = memfd_create("HostMemory", 0);
             ^
2021-07-27 20:15:23 +00:00
Jan Beich
c4cd82fa7c host_memory: Enable Linux implementation on FreeBSD
HW.Memory <Critical> common/host_memory.cpp:HostMemory:492: Fastmem unavailable, falling back to VirtualBuffer for memory allocation
2021-07-27 20:09:43 +00:00
Rodrigo Locatti
ab206d6378 Merge pull request #6748 from lioncash/engine-init
video_core/engine: Consistently initialize rasterizer pointers
2021-07-27 16:17:20 -03:00
Rodrigo Locatti
5da97c57cd Merge pull request #6744 from lioncash/exc
exception: Make constructors explicit
2021-07-27 16:11:17 -03:00
bunnei
2717e79c74 Merge pull request #6745 from lioncash/copies
video_core: Remove some unused variables
2021-07-27 11:38:32 -07:00
bunnei
e2c42ec5e2 Merge pull request #6747 from lioncash/wrapper
vulkan_wrapper: Fix SetObjectName() always indicating objects as images
2021-07-27 10:18:45 -07:00
jls47
ef29ed75b0 qt_web_browser: Fix lambda capture for HIDButton 2021-07-27 11:31:12 -04:00
jls47
3109d1c3db qt_web_browser: Focus on the first link element
Focusing on the first link element fixes element navigation upon loading the web applet in games such as Super Mario Odyssey
2021-07-27 11:31:11 -04:00
Jan Beich
a24224e274 cmake: don't use pkg-config directly with non-reference libusb
CMake Error at externals/libusb/CMakeLists.txt:120 (add_library):
  Cannot find source file:

    libusb/libusb/core.c

  Tried extensions .c .C .c++ .cc .cpp .cxx .cu .mpp .m .M .mm .h .hh .h++
  .hm .hpp .hxx .in .txx .f .F .for .f77 .f90 .f95 .f03 .ispc

CMake Error at externals/libusb/CMakeLists.txt:120 (add_library):
  No SOURCES given to target: usb

ld: error: undefined symbol: libusb_interrupt_transfer
>>> referenced by gc_adapter.cpp
>>>               gc_adapter.cpp.o:(GCAdapter::Adapter::SendVibrations()) in archive src/input_common/libinput_common.a
>>> referenced by gc_adapter.cpp
>>>               gc_adapter.cpp.o:(GCAdapter::Adapter::GetGCEndpoint(libusb_device*)) in archive src/input_common/libinput_common.a
>>> referenced by gc_adapter.cpp
>>>               gc_adapter.cpp.o:(GCAdapter::Adapter::AdapterInputThread()) in archive src/input_common/libinput_common.a

ld: error: undefined symbol: libusb_error_name
>>> referenced by gc_adapter.cpp
>>>               gc_adapter.cpp.o:(GCAdapter::Adapter::SendVibrations()) in archive src/input_common/libinput_common.a

ld: error: undefined symbol: libusb_control_transfer
>>> referenced by gc_adapter.cpp
>>>               gc_adapter.cpp.o:(GCAdapter::Adapter::CheckDeviceAccess()) in archive src/input_common/libinput_common.a

ld: error: undefined symbol: libusb_kernel_driver_active
>>> referenced by gc_adapter.cpp
>>>               gc_adapter.cpp.o:(GCAdapter::Adapter::CheckDeviceAccess()) in archive src/input_common/libinput_common.a

ld: error: undefined symbol: libusb_close
>>> referenced by gc_adapter.cpp
>>>               gc_adapter.cpp.o:(GCAdapter::Adapter::CheckDeviceAccess()) in archive src/input_common/libinput_common.a
>>> referenced by gc_adapter.cpp
>>>               gc_adapter.cpp.o:(GCAdapter::Adapter::ClearLibusbHandle()) in archive src/input_common/libinput_common.a
>>> referenced by gc_adapter.cpp
>>>               gc_adapter.cpp.o:(GCAdapter::Adapter::Reset()) in archive src/input_common/libinput_common.a
>>> referenced by gc_adapter.cpp
>>>               gc_adapter.cpp.o:(GCAdapter::Adapter::Setup()) in archive src/input_common/libinput_common.a
>>> referenced by gc_adapter.cpp
>>>               gc_adapter.cpp.o:(GCAdapter::Adapter::AdapterScanThread()) in archive src/input_common/libinput_common.a

ld: error: undefined symbol: libusb_detach_kernel_driver
>>> referenced by gc_adapter.cpp
>>>               gc_adapter.cpp.o:(GCAdapter::Adapter::CheckDeviceAccess()) in archive src/input_common/libinput_common.a

ld: error: undefined symbol: libusb_claim_interface
>>> referenced by gc_adapter.cpp
>>>               gc_adapter.cpp.o:(GCAdapter::Adapter::CheckDeviceAccess()) in archive src/input_common/libinput_common.a

ld: error: undefined symbol: libusb_get_config_descriptor
>>> referenced by gc_adapter.cpp
>>>               gc_adapter.cpp.o:(GCAdapter::Adapter::GetGCEndpoint(libusb_device*)) in archive src/input_common/libinput_common.a

ld: error: undefined symbol: libusb_release_interface
>>> referenced by gc_adapter.cpp
>>>               gc_adapter.cpp.o:(GCAdapter::Adapter::ClearLibusbHandle()) in archive src/input_common/libinput_common.a
>>> referenced by gc_adapter.cpp
>>>               gc_adapter.cpp.o:(GCAdapter::Adapter::Reset()) in archive src/input_common/libinput_common.a
>>> referenced by gc_adapter.cpp
>>>               gc_adapter.cpp.o:(GCAdapter::Adapter::Setup()) in archive src/input_common/libinput_common.a
>>> referenced by gc_adapter.cpp
>>>               gc_adapter.cpp.o:(GCAdapter::Adapter::AdapterScanThread()) in archive src/input_common/libinput_common.a

ld: error: undefined symbol: libusb_init
>>> referenced by gc_adapter.cpp
>>>               gc_adapter.cpp.o:(GCAdapter::Adapter::Adapter()) in archive src/input_common/libinput_common.a

ld: error: undefined symbol: libusb_open_device_with_vid_pid
>>> referenced by gc_adapter.cpp
>>>               gc_adapter.cpp.o:(GCAdapter::Adapter::Setup()) in archive src/input_common/libinput_common.a

ld: error: undefined symbol: libusb_get_device
>>> referenced by gc_adapter.cpp
>>>               gc_adapter.cpp.o:(GCAdapter::Adapter::Setup()) in archive src/input_common/libinput_common.a

ld: error: undefined symbol: libusb_exit
>>> referenced by gc_adapter.cpp
>>>               gc_adapter.cpp.o:(GCAdapter::Adapter::Reset()) in archive src/input_common/libinput_common.a
2021-07-27 15:01:56 +00:00
Morph
a6359fe9ae service: ns: Remove unused ns_language header 2021-07-27 08:59:26 -04:00
Morph
d7cd316a6c service: ns: Map ZH_TW and ZH_CN to Traditional/Simplified Chinese 2021-07-27 08:54:41 -04:00
Lioncash
00e100de08 render_target: Add missing initializer for size extent
Everything else has a default constructor that does the straightforward
thing of initializing most members to a default value, except for the
size.

We explicitly initialize the size (and others, for consistency), to
prevent potential uninitialized reads from occurring. Particularly given
the largeish surface area that this struct is used in.
2021-07-27 07:41:21 -04:00
Lioncash
f8964dd89a video_core/engine: Consistently initialize rasterizer pointers
Ensures all of the engines have consistent and deterministic
initialization of the rasterizer pointers.
2021-07-27 07:30:57 -04:00
Lioncash
8c82c594f0 vulkan_wrapper: Fix SetObjectName() always indicating objects as images
We should be using the passed in object type instead.
2021-07-27 07:19:15 -04:00
Lioncash
ec56a17acd buffer_cache: Remove unused small_vector in CommitAsyncFlushesHigh()
Given this is non-trivial, the constructor is required to execute, so
this removes a bit of redundant codegen.
2021-07-27 06:24:44 -04:00
Lioncash
075a744e38 gl_shader_cache: Remove unused variable 2021-07-27 06:23:49 -04:00
Lioncash
296728ec46 vk_compute_pass: Remove unused captures
Resolves two compiler warnings.
2021-07-27 06:17:52 -04:00
Lioncash
c27ddb44de exception: Make constructors explicit
Ensures that exception construction is always explicit.
2021-07-27 04:15:14 -04:00
Lioncash
e490ddf327 exception: Make what() member function nodiscard 2021-07-27 04:14:32 -04:00
Lioncash
90f3678ada exception: Narrow down specific header
We can use the <exception> header instead of pulling in all of the
exception-style classes.
2021-07-27 04:09:18 -04:00
Morph
f5f04cce01 common: fs: fs_util: Add BufferToUTF8String
Allows for direct conversion to std::string without having to convert std::u8string to std::string
2021-07-27 00:02:22 -04:00
Morph
b9b48aee7d common: uuid: Return a lower-case hex string in Format 2021-07-26 23:54:59 -04:00
bunnei
d6c799494c Merge pull request #6696 from ameerj/speed-limit-rename
general: Rename "Frame Limit" references to "Speed Limit"
2021-07-26 18:51:00 -07:00
Rodrigo Locatti
7511f65e31 Merge pull request #6741 from ReinUsesLisp/stream-remove
vk_stream_buffer: Remove unused stream buffer
2021-07-26 20:35:01 -03:00
Rodrigo Locatti
1a94b3f452 Merge pull request #6740 from K0bin/hvv-fallback
Handle allocation failure in Staging buffer
2021-07-26 20:34:44 -03:00
Robin Kertels
75050c788c vk_staging_buffer_pool: Fall back to host memory when allocation fails 2021-07-26 23:37:18 +02:00
Rodrigo Locatti
c6991fa900 Merge pull request #6728 from ReinUsesLisp/null-buffer-usage
vk_buffer_cache: Add transform feedback usage to null buffer
2021-07-26 18:30:45 -03:00
Rodrigo Locatti
6c432d5349 Merge pull request #6729 from ReinUsesLisp/quad-indexed-barrier
vk_compute_pass: Fix pipeline barrier for indexed quads
2021-07-26 18:30:05 -03:00
ReinUsesLisp
5f9a4817a5 vk_stream_buffer: Remove unused stream buffer
Remove unused file.
2021-07-26 18:19:53 -03:00
Rodrigo Locatti
c0f99558fb Merge pull request #6724 from lioncash/nodisc-shader
shader_recompiler: Remove unnecessary [[nodiscard]] instances
2021-07-26 16:35:21 -03:00
Rodrigo Locatti
de0b89792c Merge pull request #6726 from lioncash/hguard
emit_spirv_instructions: Add missing header guard
2021-07-26 16:35:11 -03:00
Rodrigo Locatti
3d97f1e6cf Merge pull request #6727 from lioncash/topology
emit_glasm: Fix LINESS_ADJACENCY typo in InputPrimitive()
2021-07-26 16:35:03 -03:00
bunnei
7490117fa4 Merge pull request #6736 from CaptV0rt3x/patch-1
Config-graphics: reword GLASM option
2021-07-26 11:27:46 -07:00
Vamsi Krishna
c05bbf375d configure_graphics: reword GLASM option
Change wording to explain that GLASM is actually short for Assembly Shaders
2021-07-26 20:49:31 +05:30
Rodrigo Locatti
b2b3fcdccd Merge pull request #6723 from lioncash/shader
object_pool: Add missing return in Chunk move assignment operator
2021-07-26 06:01:21 -03:00
Rodrigo Locatti
4afc2de129 Merge pull request #6725 from lioncash/control-token
control_flow: Fix duplicate switch case in OpcodeToken
2021-07-26 06:00:23 -03:00
ReinUsesLisp
771dcb2a56 vk_compute_pass: Fix pipeline barrier for indexed quads
Use an index buffer barrier instead of a vertex input read barrier.
2021-07-26 05:51:09 -03:00
ReinUsesLisp
27ed6e7c2b vk_buffer_cache: Add transform feedback usage to null buffer
Fixes bad API usages on Vulkan.
2021-07-26 05:49:37 -03:00
Lioncash
3e7813e49d emit_glasm: Fix LINESS_ADJACENCY typo in InputPrimitive()
This should be LINES_ADJACENCY
2021-07-26 04:44:56 -04:00
Lioncash
c2915d9f2f emit_spirv_instructions: Add missing header guard 2021-07-26 04:28:35 -04:00
Lioncash
06ca911621 shader_recompiler: Remove unnecessary [[nodiscard]] instances
[[nodiscard]] doesn't do anything on functions with a void return type
and causes superfluous warnings.
2021-07-26 04:23:59 -04:00
Lioncash
0b67df1f7c control_flow: Fix duplicate switch case in OpcodeToken
This previously duplicated the case of the PBK case above it.
2021-07-26 04:16:34 -04:00
Lioncash
89ad9df0e9 object_pool: Add missing return in Chunk move assignment operator
Prevents undefined behavior from occurring.
2021-07-26 04:01:05 -04:00
ReinUsesLisp
66a0cedba3 shader: Fold integer FMA from Nvidia's pattern
Fold shaders doing "a * b + c" on integers from the pattern generated by
Nvidia's GL compiler.

On a somewhat complex compute shader it reduces the code size by 16
instructions from 2 matches on Turing GPUs.

On Intel as extracted from KHR_pipeline_executable_properties:
Before the optimization:
```
Instruction Count: 2057
Basic Block Count: 45
Scratch Memory Size: 14752
Spill Count: 232
Fill Count: 261
SEND Count: 610
Cycle Count: 11325
```

After the optimization:
```
Instruction Count: 2046
Basic Block Count: 44
Scratch Memory Size: 13728
Spill Count: 219
Fill Count: 268
SEND Count: 604
Cycle Count: 11367
```
2021-07-26 04:58:02 -03:00
ReinUsesLisp
09fb41dc63 shader: Use TryInstRecursive on XMAD multiply folding
Simplify a bit the logic.
2021-07-26 04:15:27 -03:00
ReinUsesLisp
f6f0383b49 shader: Add TryInstRecursive utility to values 2021-07-26 01:31:05 -03:00
bunnei
c09557acd8 Merge pull request #6697 from ameerj/fps-cap
config, nvflinger: Add FPS cap setting
2021-07-25 16:23:44 -07:00
lat9nq
09d6cc9943 Merge branch 'master' into fullscreen-enum 2021-07-25 15:31:33 -04:00
bunnei
7e272d3cd8 Merge pull request #6575 from FernandoS27/new_settings
Settings: Eliminate ASYNC & MULTICORE Toggles and add GPU Accuracy to  status bar
2021-07-25 11:45:45 -07:00
Morph
b5c3cb8763 Merge pull request #6709 from ameerj/screenshot-path
main: Fix screenshot filepath construction
2021-07-25 14:41:38 -04:00
bunnei
98b26b6e12 Merge pull request #6585 from ameerj/hades
Shader Decompiler Rewrite
2021-07-25 11:39:04 -07:00
ameerj
9dd35b7b66 main: Fix screenshot filepath construction
The screenshot directory path returned does not have a trailing directory separator character. This caused screenshots to be saved in the parent directory of the configured screenshot directory.

This fixes that behavior
2021-07-25 14:24:08 -04:00
bunnei
c2aaf51370 Merge pull request #6699 from lat9nq/common-threads
common: Publically link to pthreads
2021-07-25 10:43:11 -07:00
Fernando S
e7c30f33fe Merge pull request #6706 from FernandoS27/skyline-love-letter
Grant a partial license exception to Skyline Emulator.
2021-07-25 12:24:29 +02:00
Fernando Sahmkow
e6a0ca5f2c Grant a license exception to Skyline Emulator. 2021-07-25 02:09:42 +02:00
bunnei
84b9c42642 Merge pull request #6690 from ReinUsesLisp/dma-clear-fixups
buffer_cache: Misc fixups related to buffer clears
2021-07-24 01:27:50 -04:00
ameerj
c80ae87b4e renderer_base: Removed redundant settings
use_framelimiter was not being used internally by the renderers.
set_background_color was always set to true as there is no toggle for the renderer background color, instead users directly choose the color of their choice.
2021-07-23 22:10:01 -04:00
ameerj
9dfbc9bdce general: Rename "Frame Limit" references to "Speed Limit"
This setting is best referred to as a speed limit, as it involves the limits of all timing based aspects of the emulator, not only framerate.
This allows us to differentiate it from the fps unlocker setting.
2021-07-23 22:10:01 -04:00
ameerj
2c6e274b39 config, nvflinger: Add FPS cap setting
Allows finer tuning of the FPS limit.
2021-07-23 22:04:36 -04:00
bunnei
2656020608 Merge pull request #6551 from bunnei/improve-kernel-obj
Improve management of kernel objects
2021-07-23 21:23:56 -04:00
lat9nq
d8b00fd863 configuration: Use combobox apply template where possible
We don't need to manually apply this setting now that a template can do
this for us.
2021-07-23 10:18:07 -04:00
lat9nq
b11c81cc13 general: Implement FullscreenMode enumeration
Prevents us from using an unclear 0 or 1 to describe the fullscreen
mode.
2021-07-23 10:14:37 -04:00
lat9nq
eb61824ea5 common: Publically link to pthreads
Common requires pthreads but does not refer to it when linking to other
modules. Fix this by linking to Threads where necessary.
2021-07-23 09:47:52 -04:00
ReinUsesLisp
7f13104c17 shader: Support out of bound local memory reads and immediate writes
Support ignoring immediate out of bound writes. Writing dynamically out
of bounds is not yet supported (e.g. R0+0x4).

Reading out of bounds yields zero. This is supported checking for the
size from the IR; if the input is immediate, the optimization passes
will drop it.
2021-07-22 21:51:41 -04:00
ReinUsesLisp
a55ff22900 vulkan/blit_image: Commit descriptor sets within worker thread
Fixes race condition caused. The descriptor pool is not thread safe, so
we have to commit descriptor sets within the same thread.
2021-07-22 21:51:40 -04:00
ReinUsesLisp
f6796cad9c vulkan_device: Blacklist Volta and older from VK_KHR_push_descriptor
Causes crashes on Link's Awakening intro. It's hard to debug if it's our
fault due to bugs in validation layers.
2021-07-22 21:51:40 -04:00
ReinUsesLisp
594ea29015 cmake: Remove unused code in GenerateSCMRev.cmake
Remove shader code hash generation code as it's no longer used.
2021-07-22 21:51:40 -04:00
ReinUsesLisp
a741513e65 qt: Remove "experimental" from asynchronous shader building UI 2021-07-22 21:51:40 -04:00
ReinUsesLisp
3c6d440015 Revert "renderers: Disable async shader compilation"
This reverts commit 4a152767286717fa69bfc94846a124a366f70065.
2021-07-22 21:51:40 -04:00
ReinUsesLisp
8381490a04 opengl: Fix asynchronous shaders
Wait for shader to build before configuring it, and wait for the shader
to build before sharing it with other contexts.
2021-07-22 21:51:40 -04:00
ReinUsesLisp
258f35515d shader_environment: Receive cache version from outside
This allows us invalidating OpenGL and Vulkan separately in the future.
2021-07-22 21:51:40 -04:00
ReinUsesLisp
4a82450c81 cmake: Remove shader cache version 2021-07-22 21:51:40 -04:00
ameerj
56478bc9ac shader: Fix disabled attribute default values 2021-07-22 21:51:40 -04:00
ameerj
c9528282d9 gl_device: Simplify GLASM setting logic 2021-07-22 21:51:40 -04:00
ameerj
56c30dd9e0 glsl: Simplify FCMP emission 2021-07-22 21:51:40 -04:00
ameerj
79d2684261 glsl: Update TessellationControl gl_in
Adheres to GL_ARB_separate_shader_objects requirements
2021-07-22 21:51:40 -04:00
ReinUsesLisp
e1ed218b41 renderer_opengl: Use ARB_separate_shader_objects
Ensures that states set for a particular stage are not attached to other
stages which may not need them.
2021-07-22 21:51:40 -04:00
ameerj
fc7bed21b5 shader: Implement ISETP.X 2021-07-22 21:51:40 -04:00
ReinUsesLisp
bf2956d77a shader: Avoid usage of C++20 ranges to build in clang 2021-07-22 21:51:40 -04:00
ameerj
94af0a00f6 glsl: Clamp shared mem size to GL_MAX_COMPUTE_SHARED_MEMORY_SIZE 2021-07-22 21:51:40 -04:00
ReinUsesLisp
8c166c68d4 gl_shader_cache: Properly implement asynchronous shaders 2021-07-22 21:51:40 -04:00
lat9nq
49946cf780 shader_recompiler, video_core: Resolve clang errors
Silences the following warnings-turned-errors:
-Wsign-conversion
-Wunused-private-field
-Wbraced-scalar-init
-Wunused-variable

And some other errors
2021-07-22 21:51:40 -04:00
ameerj
4e4b8775b5 main: Update Shader Cache menu options
This change adds two new context menu items to remove either the OpenGL or the Vulkan shader caches individually, and the provides the option to remove all caches for the selected title.
This also changes the behavior of the open shader cache option. Now it creates the shader cache directory for the title if it does not yet exist.
2021-07-22 21:51:40 -04:00
ameerj
41493fbe89 renderers: Fix clang formatting 2021-07-22 21:51:40 -04:00
ReinUsesLisp
2235a51b5d shader: Manually convert from array<u32> to bitset instead of using bit_cast 2021-07-22 21:51:40 -04:00
ameerj
8390286a89 renderers: Disable async shader compilation
The current implementation is prone to causing graphical issues. Disable until a better solution is implemented.
2021-07-22 21:51:40 -04:00
ReinUsesLisp
be54aad1c4 maxwell_to_vk: Add R16_SNORM 2021-07-22 21:51:40 -04:00
lat9nq
18fb9bdfa8 configure_graphics: Mark SPIR-V as Experimental, Mesa only 2021-07-22 21:51:40 -04:00
ameerj
41c6cb70f9 glsl: Fix tracking of info.uses_shadow_lod 2021-07-22 21:51:40 -04:00
ameerj
11f04f1022 shader: Ignore global memory ops on devices lacking int64 support 2021-07-22 21:51:40 -04:00
lat9nq
55233c2861 vulkan_device: Add missing include algorithm 2021-07-22 21:51:40 -04:00
ameerj
7277d7fe96 vulkan_device: Blacklist ampere devices from float16 math 2021-07-22 21:51:40 -04:00
ameerj
57f222c56e dual_vertex_pass: Clang format 2021-07-22 21:51:40 -04:00
ameerj
dbee32d302 gl_shader_cache: Fixes for async shaders 2021-07-22 21:51:40 -04:00
ReinUsesLisp
57171b23f9 vulkan_device: Enable VK_EXT_extended_dynamic_state on RADV 21.2 onward 2021-07-22 21:51:40 -04:00
ReinUsesLisp
8722668b3c emit_spirv: Workaround VK_KHR_shader_float_controls on fp16 Nvidia
Fix regression on Fire Emblem: Three Houses when using native fp16.
2021-07-22 21:51:40 -04:00
lat9nq
1b27a2b597 configure_graphics: Re-order vulkan device populating 2021-07-22 21:51:40 -04:00
lat9nq
2e5af95541 shader: GCC fmt 8.0.0 fixes 2021-07-22 21:51:40 -04:00
ameerj
b9069c7891 shader: Account for 33-bit IADD3 scenario 2021-07-22 21:51:40 -04:00
ReinUsesLisp
b21bf79bd2 shader: Only apply shift on register mode for IADD3 2021-07-22 21:51:39 -04:00
ReinUsesLisp
fba6bd92d4 vk_rasterizer: Workaround bug in VK_EXT_vertex_input_dynamic_state
Workaround potential bug on Nvidia's driver where only updating high
attributes leaves low attributes out dated.
2021-07-22 21:51:39 -04:00
ReinUsesLisp
5643a909bc shader: Fix disabled and unwritten attributes and varyings 2021-07-22 21:51:39 -04:00
ameerj
65daec8b75 glsl: Fix shared and local memory declarations
account for the fact that program.*memory_size is in units of bytes.
2021-07-22 21:51:39 -04:00
ameerj
8289eb108f opengl: Implement LOP.CC
Used by MH:Rise
2021-07-22 21:51:39 -04:00
ReinUsesLisp
f94f0be521 vk_graphics_pipeline: Implement smooth lines 2021-07-22 21:51:39 -04:00
ReinUsesLisp
57a8921e01 vk_graphics_pipeline: Implement line width 2021-07-22 21:51:39 -04:00
ReinUsesLisp
5b2b0634a1 spirv: Fix code emission when descriptor aliasing is unsupported
Fixes OpenGL.
2021-07-22 21:51:39 -04:00
lat9nq
fb9b1787f8 video_core: Enable GL SPIR-V shaders 2021-07-22 21:51:39 -04:00
lat9nq
1152d66ddd general: Add setting shader_backend
GLASM is getting good enough that we can move it out of advanced
graphics settings. This removes the setting `use_assembly_shaders`,
opting for a enum class `shader_backend`. This comes with the benefits
that it is extensible for additional shader backends besides GLSL and
GLASM, and this will work better with a QComboBox.

Qt removes the related assembly shader setting from the Advanced
Graphics section and places it as a new QComboBox in the API Settings
group. This will replace the Vulkan device selector when OpenGL is
selected.

Additionally, mark all of the custom anisotropic filtering settings as
"WILL BREAK THINGS", as that is the case with a select few games.
2021-07-22 21:51:39 -04:00
ameerj
00fa09dc45 glsl: Declare local memory in main 2021-07-22 21:51:39 -04:00
ameerj
f7352411f0 glsl: Add passthrough geometry shader support 2021-07-22 21:51:39 -04:00
ReinUsesLisp
8612b5fec5 shader: Use std::bit_cast instead of Common::BitCast for passthrough 2021-07-22 21:51:39 -04:00
ReinUsesLisp
8a3427a4c8 glasm: Add passthrough geometry shader support 2021-07-22 21:51:39 -04:00
ReinUsesLisp
7dafa96ab5 shader: Rework varyings and implement passthrough geometry shaders
Put all varyings into a single std::bitset with helpers to access it.

Implement passthrough geometry shaders using host's.
2021-07-22 21:51:39 -04:00
ReinUsesLisp
4f052a1f39 vk_graphics_pipeline: Implement conservative rendering 2021-07-22 21:51:39 -04:00
ReinUsesLisp
ecd6b4356b shader: Only verify shader when graphics debugging is enabled 2021-07-22 21:51:39 -04:00
ReinUsesLisp
395bed3a0a shader: Unify shader stage types 2021-07-22 21:51:39 -04:00
lat9nq
257d2aab74 lower_int64_to_int32: Add missing include 2021-07-22 21:51:39 -04:00
ReinUsesLisp
fb166b5ff4 shader: Emulate 64-bit integers when not supported
Useful for mobile and Intel Xe devices.
2021-07-22 21:51:39 -04:00
ReinUsesLisp
d8d5501459 shader: Add int64 to int32 lowering pass 2021-07-22 21:51:39 -04:00
ReinUsesLisp
04ef2160f9 shader: Teach global memory base tracker to follow vectors 2021-07-22 21:51:39 -04:00
ReinUsesLisp
97e80dda55 shader: Add constant propagation to integer vectors 2021-07-22 21:51:39 -04:00
ameerj
27ca8a0e13 glsl: Better IAdd Overflow CC fix
This ensures the original operand values are not overwritten when being used in the overflow detection.
2021-07-22 21:51:39 -04:00
ReinUsesLisp
4397053d5c shader: Remove IAbs64 2021-07-22 21:51:39 -04:00
ameerj
bc6e399ae3 glsl: Fix IADD CC 2021-07-22 21:51:39 -04:00
ameerj
a7536825df shader_recompiler: Fix IADD3 input partitioning 2021-07-22 21:51:39 -04:00
ReinUsesLisp
808ef97a08 shader: Move loop safety tests to code emission 2021-07-22 21:51:39 -04:00
ReinUsesLisp
3877918e96 gl_graphics_pipeline: Fix assembly shaders check for transform feedbacks 2021-07-22 21:51:39 -04:00
ameerj
cbce9ddd4a glsl: Remove frag color initialization 2021-07-22 21:51:39 -04:00
ameerj
3a2dd1b483 glasm: Implement SetAttribute ViewportMask 2021-07-22 21:51:39 -04:00
ReinUsesLisp
9bd0531384 gl_graphics_pipeline: Inline hash and operator== key functions 2021-07-22 21:51:39 -04:00
ReinUsesLisp
f5db8c7440 gl_shader_cache: Check previous pipeline before checking hash map
Port optimization from Vulkan.
2021-07-22 21:51:39 -04:00
ReinUsesLisp
218dedca1f gl_graphics_pipeline: Port optimizations from Vulkan pipelines 2021-07-22 21:51:39 -04:00
ameerj
1c648f176c emit_glsl_special: Skip initialization of frag_color0
Fixes rendering in Devil May Cry without regressing Ori and the Blind Forest.
2021-07-22 21:51:38 -04:00
ReinUsesLisp
1d182fc0f5 shader: Calibrate loop safety threshold 2021-07-22 21:51:38 -04:00
ReinUsesLisp
df9b7e18f5 buffer_cache: Fix debugging leftover 2021-07-22 21:51:38 -04:00
Morph
cfbc85839d glsl: Add missing ; in EmitSetSampleMask
Fixes shader compilation in Okami HD
2021-07-22 21:51:38 -04:00
ReinUsesLisp
838d7e4ca5 buffer_cache: Fix size reductions not having in mind bind sizes
A buffer binding can change between shaders without changing the
shaders. This lead to outdated bindings on OpenGL.
2021-07-22 21:51:38 -04:00
ameerj
9e066dcb15 glsl: Fix output varying initialization when transform feedback is used 2021-07-22 21:51:38 -04:00
ameerj
fcff19e0fa shaders: Allow shader notify when async shaders is disabled 2021-07-22 21:51:38 -04:00
ameerj
a0365217f5 texture_pass: Fix is_read image qualification
Atomic operations are considered to have both read and write access. This was not  being accounted for.
2021-07-22 21:51:38 -04:00
ReinUsesLisp
0cd08b3e72 shader: Align constant buffer sizes to 16 bytes
WAR for AMD reading zeroes on uniform buffers of size 2.
2021-07-22 21:51:38 -04:00
ReinUsesLisp
59fead3a47 spirv: Properly handle devices without int8 and int16 2021-07-22 21:51:38 -04:00
ReinUsesLisp
b5e78607ad spirv: Handle small storage buffer loads on devices with no support 2021-07-22 21:51:38 -04:00
ReinUsesLisp
ca67077ca8 vk_graphics_pipeline: Use VK_KHR_push_descriptor when available
~51% faster on Nvidia compared to previous method.
2021-07-22 21:51:38 -04:00
ameerj
ccbd24fe00 glsl: Fix cbuf component indexing bug falback 2021-07-22 21:51:38 -04:00
ReinUsesLisp
1091995f8e shader: Simplify MergeDualVertexPrograms 2021-07-22 21:51:38 -04:00
ReinUsesLisp
374eeda1a3 shader: Properly manage attributes not written from previous stages 2021-07-22 21:51:38 -04:00
ReinUsesLisp
892b8aa2ad glsl: Only declare fragment outputs on fragment shaders 2021-07-22 21:51:38 -04:00
ReinUsesLisp
0ffea97e2e shader: Split profile and runtime info headers 2021-07-22 21:51:38 -04:00
ReinUsesLisp
cbbca26d18 shader: Add support for native 16-bit floats 2021-07-22 21:51:38 -04:00
ReinUsesLisp
376aa94819 shader: Rename maxwell/program.h to translate_program.h 2021-07-22 21:51:38 -04:00
ReinUsesLisp
69f9b97e7e vulkan_device: Blacklist VK_EXT_vertex_input_dynamic_state on Intel 2021-07-22 21:51:38 -04:00
ameerj
12ef06ba8b glsl: Obey need_declared_frag_colors to declare and initialize all frag_color
Fixes Ori and the blind forest title screen
2021-07-22 21:51:38 -04:00
ameerj
d36f667bc0 glsl: Address rest of feedback 2021-07-22 21:51:38 -04:00
ameerj
c5dfa0b630 glsl: Move gl_Position/generic attribute initialization to EmitProlgue 2021-07-22 21:51:38 -04:00
ameerj
3b339fbbf6 glsl: Conditionally use fine/coarse derivatives based on device support 2021-07-22 21:51:38 -04:00
ameerj
6eea88d614 glsl: Cleanup/Address feedback 2021-07-22 21:51:38 -04:00
ameerj
74f683787e gl_shader_cache: Implement async shaders 2021-07-22 21:51:38 -04:00
ameerj
ae4e452759 glsl: Add Shader_GLSL logging 2021-07-22 21:51:38 -04:00
ameerj
6c6a451d6a glsl: Add LoopSafety instructions 2021-07-22 21:51:38 -04:00
ameerj
a0d0704aff glsl: Conditionally add EXT_texture_shadow_lod 2021-07-22 21:51:38 -04:00
ameerj
5e7b2b9661 glsl: Add stubs for sparse queries and variable aoffi when not supported 2021-07-22 21:51:38 -04:00
ameerj
6aa1bf7b6f glsl: Implement legacy varyings 2021-07-22 21:51:38 -04:00
ameerj
ff3de0fb6b gl_shader_cache: Remove const from pipeline source arguments 2021-07-22 21:51:38 -04:00
ameerj
413eb6983f gl_shader_cache: Move OGL shader compilation to the respective Pipeline constructor 2021-07-22 21:51:38 -04:00
ameerj
39c29664f9 glsl: Minor cleanup 2021-07-22 21:51:38 -04:00
ameerj
427a2596a1 glsl: Fix Cbuf getters for F32 type 2021-07-22 21:51:38 -04:00
ameerj
7c82f20b52 glsl: Add immediate index oob checking for Cbuf getters 2021-07-22 21:51:38 -04:00
ameerj
84c86e03cd glsl: Refactor GetCbuf functions to reduce code duplication 2021-07-22 21:51:38 -04:00
ameerj
e81c73a874 glsl: Address more feedback. Implement indexed texture reads 2021-07-22 21:51:38 -04:00
ameerj
7d89a82a48 glsl: Remove Signed Integer variables 2021-07-22 21:51:38 -04:00
ameerj
4759db28d0 glsl: Address Rodrigo's feedback 2021-07-22 21:51:38 -04:00
ameerj
85399e119d glsl: Reorganize backend code, remove unneeded [[maybe_unused]] 2021-07-22 21:51:37 -04:00
ameerj
e7c8f8911f glsl: Implement SampleId and SetSampleMask
plus some minor refactoring of implementations
2021-07-22 21:51:37 -04:00
ameerj
d1a68f7997 glsl: Add gl_PerVertex in for GS 2021-07-22 21:51:37 -04:00
ameerj
a926695234 glsl: Use existing tracking for enabling EXT_shader_image_load_formatted 2021-07-22 21:51:37 -04:00
ameerj
14bd73db36 glsl: Enable early fragment tests 2021-07-22 21:51:37 -04:00
ameerj
6650c4799d gl_rasterizer: Add texture fetch barrier for fragments
Fixes flicker seen in XC2
2021-07-22 21:51:37 -04:00
ameerj
3f31a547e0 glsl: Implement more attribute getters and setters 2021-07-22 21:51:37 -04:00
ameerj
8bb8bbf4ae glsl: Implement fswzadd
and wip nv thread shuffle impl
2021-07-22 21:51:37 -04:00
ameerj
c542204113 glsl: Implement indexed attribute loads 2021-07-22 21:51:37 -04:00
ameerj
2a504b4765 glsl: Conditionally add GL_ARB_sparse_texture2 2021-07-22 21:51:37 -04:00
ameerj
970fc39d98 glsl: Rebase fixes 2021-07-22 21:51:37 -04:00
ameerj
fc0db612ab glsl: Conditionally use GL_EXT_shader_image_load_formatted
Fix for SULD.D
2021-07-22 21:51:37 -04:00
ameerj
fb839061fb glsl: Remove output generic indexing for geometry stage 2021-07-22 21:51:37 -04:00
ameerj
258106038e glsl: Allow dynamic tracking of variable allocation 2021-07-22 21:51:37 -04:00
ameerj
465903468e glsl: Implement barriers 2021-07-22 21:51:37 -04:00
ameerj
421847cf1e glsl: Implement image atomics and set layer
along with some more cleanup/oversight fixes
2021-07-22 21:51:37 -04:00
ameerj
d41aef03c7 glsl: Fix image gather logic 2021-07-22 21:51:37 -04:00
ameerj
35e78d558d glsl: Add cbuf access workaround for devices with component indexing bug 2021-07-22 21:51:37 -04:00
ameerj
747b8556a4 glsl: Use textureGrad fallback when EXT_texture_shadow_lod is unsupported 2021-07-22 21:51:37 -04:00
ameerj
d12f2b8ccf emit_glsl_image: Use immediate offsets when possible 2021-07-22 21:51:37 -04:00
ameerj
0a0b0a73d8 glsl: Fix <32-bit SSBO writes
and more cleanup
2021-07-22 21:51:37 -04:00
ameerj
34fdb6471d glsl: Cleanup and address feedback 2021-07-22 21:51:37 -04:00
ameerj
5355568a2d glsl: Refactor Global memory functions 2021-07-22 21:51:37 -04:00
ameerj
a68fabf6d5 glsl: Increase NUM_VARS that can be allocated
needed for HW:AoC.
2021-07-22 21:51:37 -04:00
ameerj
8d8ce24f20 glsl: Implement Load/WriteGlobal
along with some other misc changes and fixes
2021-07-22 21:51:37 -04:00
ameerj
af9696059c glsl: Implement Images 2021-07-22 21:51:37 -04:00
ameerj
6577a63d36 glsl: skip gl_ViewportIndex write if device does not support it 2021-07-22 21:51:37 -04:00
ameerj
f4799e8fa1 glsl: Implement transform feedback 2021-07-22 21:51:37 -04:00
ameerj
31147ffe69 glsl: Yet another gl_ViewportIndex fix attempt 2021-07-22 21:51:37 -04:00
ameerj
9f3970f837 glsl: Add gl_ViewportIndex out attribute 2021-07-22 21:51:37 -04:00
lat9nq
fc29de7d5b emit_glsl_context_get_set: Remove unused function 2021-07-22 21:51:37 -04:00
ameerj
59576b82a8 glsl: Fix precise variable declaration
and add some more separation in the shader for better debugability when dumped
2021-07-22 21:51:37 -04:00
ameerj
8c684b3e23 glsl: Implement tessellation shaders 2021-07-22 21:51:37 -04:00
ameerj
c7d085b505 glsl: Implement ImageGradient and other texture function variants 2021-07-22 21:51:37 -04:00
ameerj
68d075d1e8 glsl: Fix atomic SSBO offsets
and implement misc getters
2021-07-22 21:51:37 -04:00
ameerj
19247ba4fa glsl: Implement geometry shaders 2021-07-22 21:51:37 -04:00
ameerj
df53046d68 glsl: Use NotImplemented macro with function name output 2021-07-22 21:51:37 -04:00
ameerj
3a024b3026 glsl: Implement gl_ViewportIndex
SSBU now working
2021-07-22 21:51:37 -04:00
ameerj
b7561226ed glsl: SHFL fix and prefer shift operations over divide in glsl shader 2021-07-22 21:51:37 -04:00
ameerj
e10366974e glsl: Implement precise fp variable allocation 2021-07-22 21:51:37 -04:00
ameerj
14bfb4719a HACK glsl: Write defaults to unused generic attributes 2021-07-22 21:51:37 -04:00
ameerj
4b5a4ea72e glsl: Fix ssbo indexing and name shadowing between shader stages 2021-07-22 21:51:37 -04:00
ameerj
8ec0028e68 glsl: implement set clip distance
and missed a diff in emit_glsl relating to var alloc ref counting
2021-07-22 21:51:37 -04:00
ameerj
9f3ffb996b glsl: Rework var alloc to not assign unused results 2021-07-22 21:51:37 -04:00
ameerj
1269a0cf8b glsl: Rework variable allocator to allow for variable reuse 2021-07-22 21:51:37 -04:00
ameerj
9ccbd74991 glsl: Fix ATOM and implement ATOMS 2021-07-22 21:51:37 -04:00
ameerj
68ef3803bf glsl: Use gl_SubGroupInvocationARB 2021-07-22 21:51:36 -04:00
ameerj
e35ffbbeb0 glsl: Implement VOTE for subgroup size potentially larger 2021-07-22 21:51:36 -04:00
ameerj
770b754afd glsl: Implement VOTE 2021-07-22 21:51:36 -04:00
ameerj
181a4ffdc4 glsl: Implement ST{LS} 2021-07-22 21:51:36 -04:00
ameerj
57d354b02c glsl: Implement more instructions used by SMO 2021-07-22 21:51:36 -04:00
ameerj
7df0815117 glsl: Implement more instructions used by SMO 2021-07-22 21:51:36 -04:00
ameerj
80eec85867 glsl: Fix GetAttribute return values
fixes font rendering issues as these were used to index into the ssbos
2021-07-22 21:51:36 -04:00
ameerj
1542f31e79 glsl: minor cleanup 2021-07-22 21:51:36 -04:00
ameerj
005eecffcd glsl: Fix and implement rest of cbuf access 2021-07-22 21:51:36 -04:00
ameerj
3047eb6688 glsl: Implement TXQ and other misc changes 2021-07-22 21:51:36 -04:00
ameerj
5fd92780b2 glsl: TLD4 implementation 2021-07-22 21:51:36 -04:00
ameerj
697eacd095 glsl: Implement TLD instruction 2021-07-22 21:51:36 -04:00
ameerj
e4ba755705 glsl: Implement TEXS 2021-07-22 21:51:36 -04:00
ameerj
59a692e9ed glsl: Cleanup texture functions 2021-07-22 21:51:36 -04:00
lat9nq
c9a25855bc shader_recompiler: GCC fixes 2021-07-22 21:51:36 -04:00
ameerj
7619b7d427 glsl: Implement TEX depth functions 2021-07-22 21:51:36 -04:00
ameerj
55e0211a5e glsl: Implement TEX ImageSample functions 2021-07-22 21:51:36 -04:00
ameerj
b98de76ea8 glsl: Rework Shuffle emit instructions to align with SPIR-V 2021-07-22 21:51:36 -04:00
ameerj
8ba814efb2 glsl: Better Storage access and wip warps 2021-07-22 21:51:36 -04:00
ameerj
86d4a05cec glsl: Fix integer conversions, implement clamp CC 2021-07-22 21:51:36 -04:00
ameerj
21797efa54 glsl: Implement IADD CC 2021-07-22 21:51:36 -04:00
ameerj
453cd25da5 glsl: SSBO access fixes and wip SampleExplicitLod implementation. 2021-07-22 21:51:36 -04:00
ameerj
f6bbc76336 glsl: WIP var forward declaration
to fix Loop control flow.
2021-07-22 21:51:36 -04:00
ameerj
2a71333716 glsl: Fix bindings, add some CC ops 2021-07-22 21:51:36 -04:00
ameerj
6674637853 glsl: remove unused headers 2021-07-22 21:51:36 -04:00
ameerj
a752ec88d0 glsl: Implement derivatives and YDirection
plus some other misc additions/changed
2021-07-22 21:51:36 -04:00
ameerj
ed14d31f66 glsl: Fix non-immediate buffer access
and many other misc implementations
2021-07-22 21:51:36 -04:00
ameerj
d171083d53 glsl: textures wip 2021-07-22 21:51:36 -04:00
ameerj
3d086e6130 glsl: Implement some attribute getters and setters 2021-07-22 21:51:36 -04:00
ameerj
5399906c26 glsl: Track S32 atomics 2021-07-22 21:51:36 -04:00
ameerj
b95716e543 glsl: Update phi node management 2021-07-22 21:51:36 -04:00
ameerj
67f881e714 glsl: Fix floating point compare ops
Logic for ordered/unordered ops was wrong.
2021-07-22 21:51:36 -04:00
ameerj
bd24fa9713 glsl: Query GL Device for FP16 extension support 2021-07-22 21:51:36 -04:00
ameerj
3482df1176 glsl: Simply FP storage atomics 2021-07-22 21:51:36 -04:00
ameerj
9cc1b8a873 glsl: F16x2 storage atomics 2021-07-22 21:51:36 -04:00
ameerj
11ba190462 glsl: Revert ssbo aliasing. Storage Atomics impl 2021-07-22 21:51:36 -04:00
ameerj
e99d01ff53 glsl: implement phi nodes 2021-07-22 21:51:36 -04:00
ameerj
3d9ecbe998 glsl: Wip storage atomic ops 2021-07-22 21:51:36 -04:00
ameerj
df793fc049 glsl: Implement FCMP 2021-07-22 21:51:36 -04:00
ameerj
cdde730219 glsl: Add a more robust fp formatter 2021-07-22 21:51:36 -04:00
ameerj
ac7b0ebcb7 glsl: More FP fixes 2021-07-22 21:51:36 -04:00
ameerj
3064bde415 glsl: FP function fixes 2021-07-22 21:51:36 -04:00
ameerj
65c6f73e43 glsl: More FP instructions/fixes 2021-07-22 21:51:36 -04:00
ameerj
5e9095ef22 glsl: Add many FP32/64 instructions 2021-07-22 21:51:36 -04:00
ReinUsesLisp
53667ddd4e glsl: Fixup build issues 2021-07-22 21:51:36 -04:00
ameerj
ef7bd53f18 glsl: Implement more Integer ops 2021-07-22 21:51:36 -04:00
ameerj
266a3d60e3 glsl: Implement BF* 2021-07-22 21:51:36 -04:00
ameerj
0f40b0e61c glsl: Implement a few Integer instructions 2021-07-22 21:51:36 -04:00
ameerj
fb75d122a2 glsl: Use std::string_view for Emit function args. 2021-07-22 21:51:35 -04:00
ameerj
115c162b9a glsl: Pass IR::Inst& to Emit functions 2021-07-22 21:51:35 -04:00
ameerj
78f5eb90d7 glsl: INeg and IAdd negate tests 2021-07-22 21:51:35 -04:00
ameerj
e221baccdd glsl: Reusable typed variables. IADD32 2021-07-22 21:51:35 -04:00
ameerj
faf4cd72c5 glsl: Fix program linking and cbuf 2021-07-22 21:51:35 -04:00
ameerj
64337f004d glsl: Fix "reg" allocing
based on glasm with some tweaks
2021-07-22 21:51:35 -04:00
ameerj
eaff1030de glsl: Initial backend 2021-07-22 21:51:35 -04:00
ReinUsesLisp
3d822faea1 spirv: Reduce log severity of mismatching denorm rules 2021-07-22 21:51:35 -04:00
ReinUsesLisp
7ac55c2a75 shader: Fix loop safety to SSA pass 2021-07-22 21:51:35 -04:00
ReinUsesLisp
8fb2048934 vk_rasterizer: Exit render passes on fragment barriers 2021-07-22 21:51:35 -04:00
Rodrigo Locatti
dbf7cb9f90 vk_graphics_pipeline: Fix path with no VK_EXT_extended_dynamic_state 2021-07-22 21:51:35 -04:00
ReinUsesLisp
94e751f415 buffer_cache: Invalidate fast buffers on compute 2021-07-22 21:51:35 -04:00
ReinUsesLisp
61cd7dd301 shader: Add logging 2021-07-22 21:51:35 -04:00
lat9nq
373f75d944 shader: Add shader loop safety check settings
Also add a setting for enable Nsight Aftermath.
2021-07-22 21:51:35 -04:00
ReinUsesLisp
487057b8d2 shader: Comment why the array component is not read in TMML 2021-07-22 21:51:35 -04:00
ReinUsesLisp
ba3bdf1d41 vulkan_device: Enable VK_EXT_vertex_input_dynamic_state 2021-07-22 21:51:35 -04:00
ReinUsesLisp
41cca8b8ad vk_pipeline_cache: Skip cached pipelines with different dynamic state 2021-07-22 21:51:35 -04:00
ameerj
5445799260 main: Fix Open Transferable Shader Cache context item
Opens the new shader cache directory location for the specified title, if it exists.
2021-07-22 21:51:35 -04:00
ameerj
3c125d4134 tmml: Remove index component from coords vec
The lod query functions exposed by the rendering API's do not make use of the texturearray layer indexing.
2021-07-22 21:51:35 -04:00
ReinUsesLisp
ea038d6653 vulkan: Add VK_EXT_vertex_input_dynamic_state support
Reduces the number of total pipelines generated on Vulkan.
Tested on Super Smash Bros. Ultimate.
2021-07-22 21:51:35 -04:00
ReinUsesLisp
cb78a1b494 shader: Reorder shader cache directories 2021-07-22 21:51:35 -04:00
ReinUsesLisp
3025b2f605 vk_rasterizer: Implement first index 2021-07-22 21:51:35 -04:00
ReinUsesLisp
d554778311 vulkan: Use VK_EXT_provoking_vertex when available 2021-07-22 21:51:35 -04:00
ameerj
d52bacf6f0 spirv/convert: Catch more signed operations oversights
The sign bit on integers of size < 32 was not properly preserved in casts
2021-07-22 21:51:35 -04:00
ReinUsesLisp
8554a644df spirv/convert: Catch more broken signed operations on Nvidia OpenGL
BitCast U32 to S32 before converting to float on drivers with broken
signed operations.
2021-07-22 21:51:35 -04:00
ameerj
cd8427367e gl_buffer_cache: Use unorm internal formats for snorm texture buffer views
Fixes black textures in UE4 games
2021-07-22 21:51:35 -04:00
ReinUsesLisp
5befc0bf87 shader_environment: Fix local memory size calculations 2021-07-22 21:51:35 -04:00
ReinUsesLisp
60a96c49e5 buffer_cache: Fix copy based uniform bindings tracking 2021-07-22 21:51:35 -04:00
ameerj
15bdd27cac shader_environment: Add shader_local_memory_crs_size to local memory size
Fixes DOOM 2016 missing local memory
2021-07-22 21:51:35 -04:00
ReinUsesLisp
7eaa74ad23 gl_texture_cache: Create image storage views
Fixes SULD.D tests.
2021-07-22 21:51:35 -04:00
ReinUsesLisp
b1ed64ac18 gl_shader_util: Move shader utility code to a separate file 2021-07-22 21:51:35 -04:00
ReinUsesLisp
12fe7210d2 gl_shader_cache: Store workers in shader cache object 2021-07-22 21:51:35 -04:00
ReinUsesLisp
cffd4716c5 vk_pipeline_cache,shader_notify: Add shader notifications 2021-07-22 21:51:35 -04:00
ReinUsesLisp
48aad8dc05 vk_pipeline_cache: Add asynchronous shaders 2021-07-22 21:51:35 -04:00
ReinUsesLisp
2a0aeaa3d2 vk_rasterizer: Flush work on clear and dispatches 2021-07-22 21:51:34 -04:00
FernandoS27
c736b9ffab DMA: Restrict optimised path for BlockToLinear further. 2021-07-22 21:51:34 -04:00
ReinUsesLisp
f45f7b5c2a vk_swapchain: Handle outdated swapchains
Fixes pixelated presentation on Intel devices.
2021-07-22 21:51:34 -04:00
FernandoS27
562af30181 shader: Fix VertexA Shaders. 2021-07-22 21:51:34 -04:00
ReinUsesLisp
ec9a78885e shader: Add 2D and 3D variants to SUATOM and SURED
Used by Claybook.
2021-07-22 21:51:34 -04:00
ReinUsesLisp
b02c78b276 vk_buffer_cache: Handle null texture buffers
Fixes a crash on Age of Calamity cutscenes.
2021-07-22 21:51:34 -04:00
ReinUsesLisp
8f099af6a8 nsight_aftermath_tracker: Fix SPIR-V module writes 2021-07-22 21:51:34 -04:00
ReinUsesLisp
8c954fcaee vk_pipeline_cache: Set support_derivative_control to true 2021-07-22 21:51:34 -04:00
ReinUsesLisp
4f8b68fb04 shader: Avoid CPU side undefined behavior on I2F 2021-07-22 21:51:34 -04:00
ReinUsesLisp
79f2fe1a39 glasm: Use ARB_derivative_control conditionally 2021-07-22 21:51:34 -04:00
ReinUsesLisp
4a2361a1e2 buffer_cache: Reduce uniform buffer size from shader usage
Increases performance significantly on certain titles.
2021-07-22 21:51:34 -04:00
ReinUsesLisp
e57ee3b7fd transform_feedback: Read buffer stride from index instead of layout 2021-07-22 21:51:34 -04:00
ReinUsesLisp
46bd362d0d fixed_pipeline_state: Use regular for loop instead of ranges for perf
MSVC generates better code for it.
2021-07-22 21:51:34 -04:00
ReinUsesLisp
d26271b014 vk_swapchain: Avoid recreating the swapchain on each frame
Recreate only when requested (or sRGB is changed) instead of tracking
the frontend's size. That size is still used as a hint.
2021-07-22 21:51:34 -04:00
lat9nq
22f0c4f002 emit_glasm_context_get_set: Remove unused variable 2021-07-22 21:51:34 -04:00
ReinUsesLisp
5539b13c5a shader,glasm: Implement legacy texcoord loads 2021-07-22 21:51:34 -04:00
ReinUsesLisp
cf9f88e5a7 glasm: Implement legacy varyings 2021-07-22 21:51:34 -04:00
ReinUsesLisp
ac0f5d2ab6 shader: Track legacy varyings 2021-07-22 21:51:34 -04:00
ReinUsesLisp
05d41fa9b7 shader: Add support for "negative" and unaligned offsets
"Negative" offsets don't exist. They are shown as such due to a bug in
nvdisasm.

Unaligned offsets have been proved to read the aligned offset. For
example, when reading an U32, if the offset is 6, the offset read will
be 4.
2021-07-22 21:51:34 -04:00
ReinUsesLisp
5d170de0b5 shader: Implement ISCADD32I 2021-07-22 21:51:34 -04:00
ReinUsesLisp
adc43297c5 spirv: Fix output generics with components 2021-07-22 21:51:34 -04:00
ReinUsesLisp
1148a4eac7 vulkan: Conditionally use shaderInt16
Add support for Polaris AMD devices.
2021-07-22 21:51:34 -04:00
ReinUsesLisp
77372443c3 vulkan: Enable depth bounds and use it conditionally
Intel devices pre-Xe don't support this.
2021-07-22 21:51:34 -04:00
ReinUsesLisp
c44b16124f vk_buffer_cache: Add transform feedback usage to buffers 2021-07-22 21:51:34 -04:00
ReinUsesLisp
916ca74324 opengl: Declare fragment outputs even if they are not used
Fixes Ori and the Blind Forest's menu on GLASM. For some reason
(probably high level optimizations) it is not sanitized on SPIR-V for
OpenGL. Vulkan is unaffected by this change.
2021-07-22 21:51:34 -04:00
ReinUsesLisp
a7e9756671 buffer_cache: Mark uniform buffers as dirty if any enable bit changes 2021-07-22 21:51:34 -04:00
ReinUsesLisp
329dea217d shader: Always initialize up reference in structure control flow
Fixes ubsan issue.
2021-07-22 21:51:34 -04:00
ReinUsesLisp
99f2c31b64 vulkan_device: Enable float64 and int64 conditionally
Add Intel Xe support.
2021-07-22 21:51:34 -04:00
ReinUsesLisp
d093522fac shader: Fix ImageWrite indexing 2021-07-22 21:51:34 -04:00
ReinUsesLisp
d738ad4d0b spirv: Fix image and image buffer descriptor index usage 2021-07-22 21:51:34 -04:00
ReinUsesLisp
eb8464cb3d glasm: Fix immediate texture coordinate 2021-07-22 21:51:34 -04:00
ReinUsesLisp
457dda69cc shader: Clang-format secondary textures 2021-07-22 21:51:34 -04:00
ReinUsesLisp
627161c38e shader: Fix secondary textures 2021-07-22 21:51:34 -04:00
ameerj
dd39b87b0c shader: Adhere to disk shader cache setting 2021-07-22 21:51:34 -04:00
ReinUsesLisp
b659212dbd shader: Fix TMML queries 2021-07-22 21:51:34 -04:00
ReinUsesLisp
fbf5cdcba0 shader: Fix FSwizzleAdd folding when going through phi nodes 2021-07-22 21:51:34 -04:00
ReinUsesLisp
871c9f1ced shader/exception: Fix compilation errors on gcc 2021-07-22 21:51:34 -04:00
ReinUsesLisp
b6c087496b glasm: Reduce reg allocation leaks from an exception to a log 2021-07-22 21:51:34 -04:00
ReinUsesLisp
56d4a9ebde texture_cache: Reduce invalid image/sampler error severity 2021-07-22 21:51:34 -04:00
ReinUsesLisp
b7764c3a79 shader: Handle host exceptions 2021-07-22 21:51:34 -04:00
ReinUsesLisp
83db7abae6 glasm: Use integer lod for TXQ 2021-07-22 21:51:33 -04:00
ReinUsesLisp
3b595fe8b2 glasm: Prepare XFB from state instead of global registers 2021-07-22 21:51:33 -04:00
ReinUsesLisp
e240a62017 glasm: Fix global memory fallbacks 2021-07-22 21:51:33 -04:00
ReinUsesLisp
8f3043c3cf Revert "glasm: Skip phi moves on undefined instructions"
Causes regressions on Bowser's Fury.
2021-07-22 21:51:33 -04:00
ReinUsesLisp
2aa30353b7 glasm: Remove unintentional '\n' on Undef32 2021-07-22 21:51:33 -04:00
ReinUsesLisp
adb591a757 glasm: Use storage buffers instead of global memory when possible 2021-07-22 21:51:33 -04:00
ReinUsesLisp
f58f79c85d glasm: Implement Y direction 2021-07-22 21:51:33 -04:00
ReinUsesLisp
586c785366 glasm: Skip phi moves on undefined instructions 2021-07-22 21:51:33 -04:00
ReinUsesLisp
b9c8814ea9 glasm: Implement undef instructions 2021-07-22 21:51:33 -04:00
ReinUsesLisp
8763cc1ff7 glasm: Fix global memory callbacks 2021-07-22 21:51:33 -04:00
ReinUsesLisp
a41b2ed391 gl_shader_cache: Add disk shader cache 2021-07-22 21:51:33 -04:00
ReinUsesLisp
a49532c8eb video_core,shader: Clang-format fixes 2021-07-22 21:51:33 -04:00
ReinUsesLisp
eacf18cce9 gl_shader_cache: Rename Program abstractions into Pipeline 2021-07-22 21:51:33 -04:00
ReinUsesLisp
48aafe0961 glasm: Release phi node registers after they are no longer needed 2021-07-22 21:51:33 -04:00
ReinUsesLisp
77ee733c3a glasm: Remove unintentionally committed fmt::prints 2021-07-22 21:51:33 -04:00
ReinUsesLisp
70c9281fbf glasm: Fix INeg32 on negative immediates 2021-07-22 21:51:33 -04:00
ReinUsesLisp
75fd0079db glasm: Remove unnecessary value types 2021-07-22 21:51:33 -04:00
ReinUsesLisp
379b305b4b glasm: Throw when there are register leaks 2021-07-22 21:51:33 -04:00
ReinUsesLisp
ca05a13c62 glasm: Catch more register leaks
Add support for null registers. These are used when an instruction has
no usages.

This comes handy when an instruction is only used for its CC value, with
the caveat of having to invalidate all pseudo-instructions before
defining the instruction itself in the register allocator. This commits
changes this.

Workaround a bug on Nvidia's condition codes conditional execution using
branches.
2021-07-22 21:51:33 -04:00
ReinUsesLisp
9fbfe7d676 glasm: Fix usage counting on phi nodes 2021-07-22 21:51:33 -04:00
ReinUsesLisp
4017928213 gl_shader_cache: Do not flip tessellation on OpenGL 2021-07-22 21:51:33 -04:00
ReinUsesLisp
80884e3270 gl_graphics_program: Fix texture buffer bindings 2021-07-22 21:51:33 -04:00
ReinUsesLisp
c721767bcc glasm: Implement global memory fallbacks 2021-07-22 21:51:33 -04:00
ReinUsesLisp
0794273870 glasm: Implement int64 add and subtract 2021-07-22 21:51:33 -04:00
lat9nq
7fdf0d7d33 emit_glasm_context_get_set: Remove unused variable 2021-07-22 21:51:33 -04:00
ReinUsesLisp
e30d4fa976 glasm: Implement indirect attribute loads 2021-07-22 21:51:33 -04:00
ReinUsesLisp
c8414e686f glasm: Implement image atomics 2021-07-22 21:51:33 -04:00
ReinUsesLisp
3a7ca6a7db glasm: Reorder unreachable image atomic insts
Reorder them to the bottom of the file for readability.
2021-07-22 21:51:33 -04:00
ReinUsesLisp
e565eb361a glasm: Implement gl_Layer stores 2021-07-22 21:51:33 -04:00
ReinUsesLisp
89e341d56a glasm: Implement SampleId 2021-07-22 21:51:33 -04:00
ReinUsesLisp
77d8c44b68 glasm: Implement IsHelperInvocation 2021-07-22 21:51:33 -04:00
ReinUsesLisp
ddf601919f glasm: Fix EmitVertex's optimization 2021-07-22 21:51:33 -04:00
ReinUsesLisp
1bccb43cbe gl_shader_cache: Conditionally use viewport mask 2021-07-22 21:51:33 -04:00
ReinUsesLisp
c31521512f gl_shader_cache,glasm: Conditionally use typeless image reads extension 2021-07-22 21:51:33 -04:00
ReinUsesLisp
df406246d9 gl_shader_cache: Improve GLASM error print logic 2021-07-22 21:51:33 -04:00
ReinUsesLisp
84feabac88 glasm: Implement forced early Z 2021-07-22 21:51:33 -04:00
ReinUsesLisp
6bc54e12a0 glasm: Set transform feedback state 2021-07-22 21:51:33 -04:00
ReinUsesLisp
69b910e9e7 video_core: Abstract transform feedback translation utility 2021-07-22 21:51:33 -04:00
ReinUsesLisp
7dadb2bef3 glasm: Simplify patch reads 2021-07-22 21:51:33 -04:00
ReinUsesLisp
b382f57b28 glasm: Fix output patch reads
With this, Luigi's Mansion's sand renders properly.
2021-07-22 21:51:33 -04:00
ReinUsesLisp
c07cc9d6a5 gl_shader_cache: Pass shader runtime information 2021-07-22 21:51:33 -04:00
ReinUsesLisp
9e7b6622c2 shader: Split profile and runtime information in separate structs 2021-07-22 21:51:33 -04:00
ameerj
eb15667905 emit_glasm_context_get_and_set.cpp: Add missing semicolons 2021-07-22 21:51:33 -04:00
ReinUsesLisp
781a87175c glasm: Fix patch attribute declarations 2021-07-22 21:51:33 -04:00
ameerj
36d040da70 glasm: Implement FSWZADD 2021-07-22 21:51:33 -04:00
ReinUsesLisp
3da7b98d37 glasm: Implement PrimitiveId attribute read 2021-07-22 21:51:33 -04:00
ReinUsesLisp
394b96a2fe glasm: Implement clip distance stores 2021-07-22 21:51:32 -04:00
ReinUsesLisp
a5d978e91e glasm: Fix tessellation input attributes 2021-07-22 21:51:32 -04:00
ReinUsesLisp
0d7d85c81e glasm: Add missing semicolon on tesscoord reading 2021-07-22 21:51:32 -04:00
ReinUsesLisp
48d4e26326 glasm: Fix tessellation headers 2021-07-22 21:51:32 -04:00
ReinUsesLisp
9ec2303ad6 glasm: Add tessellation shader declarations 2021-07-22 21:51:32 -04:00
ReinUsesLisp
2913ca811e glasm: Implement TessellationEvaluationPoint 2021-07-22 21:51:32 -04:00
ReinUsesLisp
54decced92 gl_shader_manager: Zero initialize current assembly programs 2021-07-22 21:51:32 -04:00
ReinUsesLisp
c0e4074721 gl_shader_manager: Remove unintentionally committed #pragma 2021-07-22 21:51:32 -04:00
ReinUsesLisp
a569ac418e glasm: Implement patch memory 2021-07-22 21:51:32 -04:00
ReinUsesLisp
164b8c1ec5 glasm: Fix InvocationId declaration 2021-07-22 21:51:32 -04:00
ReinUsesLisp
d5db96386d glasm: Implement InvocationId 2021-07-22 21:51:32 -04:00
ReinUsesLisp
679e7146a7 glasm: Optimize EmitVertex into EMIT 2021-07-22 21:51:32 -04:00
ReinUsesLisp
79929be833 glasm: Implement geometry shader attribute reads 2021-07-22 21:51:32 -04:00
ReinUsesLisp
83cef0426b glasm: Properly declare attributes on geometry programs 2021-07-22 21:51:32 -04:00
ReinUsesLisp
fad139a3e6 glasm: Declare geometry program headers 2021-07-22 21:51:32 -04:00
ReinUsesLisp
690b1841e6 renderer_opengl: State track compute assembly programs 2021-07-22 21:51:32 -04:00
ReinUsesLisp
c5ca4fe451 renderer_opengl: State track assembly programs 2021-07-22 21:51:32 -04:00
ReinUsesLisp
0a54291c9c glasm: Fix potential aliasing bug on cube array samples 2021-07-22 21:51:32 -04:00
ReinUsesLisp
8fdb00a2b5 glasm: Implement ImageWrite 2021-07-22 21:51:32 -04:00
ReinUsesLisp
dadd192b30 glasm: Implement ImageRead 2021-07-22 21:51:32 -04:00
ReinUsesLisp
3d0ffc6ad0 glasm: Implement EmitVertex and EndPrimitive 2021-07-22 21:51:32 -04:00
ReinUsesLisp
f79cbbf814 glasm: Implement ImageGradient 2021-07-22 21:51:32 -04:00
ReinUsesLisp
291f220be3 glasm: Implement 64-bit shifts 2021-07-22 21:51:32 -04:00
ReinUsesLisp
d957b3a8fe glasm: Implement barriers 2021-07-22 21:51:32 -04:00
ReinUsesLisp
b60b3fa113 glasm: Fix compute stage name 2021-07-22 21:51:32 -04:00
ReinUsesLisp
96962c1d3c glasm: Fix phi instruction types 2021-07-22 21:51:32 -04:00
ReinUsesLisp
91a3c2c1c0 glasm: Implement PREC on relevant instructions 2021-07-22 21:51:32 -04:00
ReinUsesLisp
accad56ee7 glasm: Implement stores to gl_ViewportIndex 2021-07-22 21:51:32 -04:00
ReinUsesLisp
2494dbe183 glasm: Implement gl_PointSize stores 2021-07-22 21:51:32 -04:00
ReinUsesLisp
9415c435fc glasm: Implement gl_PointCoord 2021-07-22 21:51:32 -04:00
ReinUsesLisp
12dcb9fcc2 glasm: Implement ImageQueryLod 2021-07-22 21:51:32 -04:00
ReinUsesLisp
4a22942f45 glasm: Implement ImageFetch 2021-07-22 21:51:32 -04:00
ameerj
3777592ada glasm: Implement IADD.CC 2021-07-22 21:51:32 -04:00
ReinUsesLisp
98ed8ff103 glasm: Implement BFE.CC 2021-07-22 21:51:32 -04:00
ReinUsesLisp
2e0d56da7e glasm: Implement SelectU1 2021-07-22 21:51:32 -04:00
ReinUsesLisp
85fc7e584e HACK: Bind stages before and after bindings
Works around a bug where program parameters are only applied to the
current stage, and this one wasn't bound at the moment.

Affects all SSBO usages on GLASM.
2021-07-22 21:51:32 -04:00
ReinUsesLisp
e8ed904805 glasm: Implement gl_WorkGroupID 2021-07-22 21:51:32 -04:00
ReinUsesLisp
0a42277a4f glasm: Implement TXQ and improve texture info reads 2021-07-22 21:51:32 -04:00
ReinUsesLisp
c560bf99c2 glasm: Implement gl_FrongFacing attribute 2021-07-22 21:51:32 -04:00
ReinUsesLisp
8b7d5912d6 glasm: Support textures used in more than one stage 2021-07-22 21:51:32 -04:00
ReinUsesLisp
3d3ed53511 glasm: Implement textureGather instructions 2021-07-22 21:51:32 -04:00
ReinUsesLisp
0fa421f82f glasm: Implement gl_FragDepth and gl_SampleMask stores 2021-07-22 21:51:32 -04:00
ReinUsesLisp
1ee7f8b943 glasm: Do not alias ConditionRef for now
Immediate condition refs where not handled correctly. Just move the
value for now.
2021-07-22 21:51:32 -04:00
ReinUsesLisp
9bb3e008c9 shader: Read branch conditions from an instruction
Fixes the identity removal pass.
2021-07-22 21:51:32 -04:00
ReinUsesLisp
4bad415bca glasm: Implement InstanceId and VertexId 2021-07-22 21:51:31 -04:00
ReinUsesLisp
afcb140185 glasm: Add missing return value on move assignment 2021-07-22 21:51:31 -04:00
ReinUsesLisp
fb3ba62b3a glasm: Fix aliased bitcasts ref counting 2021-07-22 21:51:31 -04:00
ReinUsesLisp
f1b334b9f9 glasm: Remove unintentional comma on vector insert 2021-07-22 21:51:31 -04:00
ReinUsesLisp
ec6fc5fe78 glasm: Implement TEX and TEXS instructions
Remove lod clamp from texture instructions with lod, as this is not
needed (nor supported).
2021-07-22 21:51:31 -04:00
ReinUsesLisp
c42a6143a5 glasm: Add support for non-2D texture samples 2021-07-22 21:51:31 -04:00
ReinUsesLisp
bee9fb0563 glasm: Reorder unreachable image instructions to the bottom 2021-07-22 21:51:31 -04:00
ReinUsesLisp
e6b4d461d2 glasm: Add support for texture offsets 2021-07-22 21:51:31 -04:00
ReinUsesLisp
bf2949df10 glasm: Improve texture sampling instructions 2021-07-22 21:51:31 -04:00
ReinUsesLisp
db2f0f4108 emit_glasm: Enable ARB_draw_buffers when needed 2021-07-22 21:51:31 -04:00
ReinUsesLisp
3c06293e20 emit_glasm: Add support for reading position attributes 2021-07-22 21:51:31 -04:00
lat9nq
f7a2340205 shader_recompiler: GCC fixes
Fixes members of unnamed union not being accessible, and one function
without a declaration.
2021-07-22 21:51:31 -04:00
ameerj
d4f9c798d6 glasm: Implement rest of shared mem 2021-07-22 21:51:31 -04:00
ReinUsesLisp
258f2dec1b opengl: Initial (broken) support to GLASM shaders 2021-07-22 21:51:31 -04:00
ReinUsesLisp
776ab3ea12 shader: Use a non-trivial dummy to construct ASL node union 2021-07-22 21:51:31 -04:00
ReinUsesLisp
38e7b8c805 emit_spirv: Jump to loop body with local variable
Silence unused variable warning
2021-07-22 21:51:31 -04:00
ReinUsesLisp
464f13fe0b glasm: Implement derivative instructions on GLASM 2021-07-22 21:51:31 -04:00
ReinUsesLisp
9fb2ea08e8 glasm: Initial (broken) implementation of TEX on GLASM 2021-07-22 21:51:31 -04:00
ReinUsesLisp
1f3446b47e glasm: Implement some graphics instructions on GLASM 2021-07-22 21:51:31 -04:00
ReinUsesLisp
31d402ee74 glasm: Add Void type to GLASM values 2021-07-22 21:51:31 -04:00
ReinUsesLisp
3764750339 glasm: Add graphics specific shader declarations to GLASM 2021-07-22 21:51:31 -04:00
ameerj
057dee4856 glasm: Implement local memory for glasm 2021-07-22 21:51:31 -04:00
ReinUsesLisp
ab5dbe7c29 emit_spirv: Add missing block in case 2021-07-22 21:51:31 -04:00
ReinUsesLisp
bf5e48ffe4 glasm: Initial implementation of phi nodes on GLASM 2021-07-22 21:51:31 -04:00
ReinUsesLisp
0f88fb5d72 glasm: Write result to scalar on integer comparison instructions 2021-07-22 21:51:31 -04:00
ReinUsesLisp
d4385c34e3 glasm: Declare NV_shader_thread_group when needed 2021-07-22 21:51:31 -04:00
ReinUsesLisp
568d813eea vk_update_descriptor: Properly initialize payload on the update descriptor queue 2021-07-22 21:51:31 -04:00
ReinUsesLisp
d54d7de40e glasm: Rework control flow introducing a syntax list
This commit regresses VertexA shaders, their transformation pass has to
be adapted to the new control flow.
2021-07-22 21:51:31 -04:00
ameerj
7ff5851608 glasm: Implement Storage atomics
StorageAtomicExchangeU64 is failing test seemingly due to failure storing 64-bit
result into the register
2021-07-22 21:51:31 -04:00
ReinUsesLisp
8c81a20ace glasm: Ensure reg alloc order across compilers on GLASM
Use a struct constructor to serialize register allocation arguments to
ensure registers are allocated in the same order regardless of the
compiler used.

The A and B functions can be called in any order when passed as
arguments to "foo":

  foo(A(), B())

But the order is guaranteed for curly-braced constructor calls in
classes:

  Foo{A(), B()}

Use this to get consistent behavior.
2021-07-22 21:51:31 -04:00
ReinUsesLisp
c917290497 glasm: Enable unintentionally disabled register aliasing on GLASM 2021-07-22 21:51:31 -04:00
ReinUsesLisp
70fbede213 glasm: Review all GLASM insts to be aware of register aliasing 2021-07-22 21:51:31 -04:00
ReinUsesLisp
c4fd6b55bc glasm: Implement shuffle and vote instructions on GLASM 2021-07-22 21:51:31 -04:00
ReinUsesLisp
decda4a2c7 glasm: Add MUFU instructions to GLASM 2021-07-22 21:51:31 -04:00
ReinUsesLisp
5b18a12df2 glasm: Implement IAbs64 and INeg64 on GLASM 2021-07-22 21:51:31 -04:00
ReinUsesLisp
3b6a632237 shader: Add floating-point rounding to I2F 2021-07-22 21:51:31 -04:00
ReinUsesLisp
3f00a2ad3f glasm: Properly clamp Fp64 on GLASM 2021-07-22 21:51:31 -04:00
ReinUsesLisp
deda89372f glasm: Fix register allocation when moving immediate on GLASM 2021-07-22 21:51:31 -04:00
ReinUsesLisp
0839e46736 glasm: Implement SelectU64 on GLASM 2021-07-22 21:51:31 -04:00
ReinUsesLisp
6237300e36 glasm: Fix clamps so the min value has priority on NAN on GLASM 2021-07-22 21:51:31 -04:00
ReinUsesLisp
8eb72ff0dc glasm: Fix moving U64 immediates to registers in GLASM 2021-07-22 21:51:31 -04:00
ameerj
80813b1d14 glasm: Implement storage atomic ops 2021-07-22 21:51:31 -04:00
ReinUsesLisp
ad61b47f80 glasm: Add conversion instructions to GLASM 2021-07-22 21:51:31 -04:00
ReinUsesLisp
7703d65f23 glasm: Add fp min/max insts and fix store for fp64 on GLASM 2021-07-22 21:51:31 -04:00
ReinUsesLisp
43a448d98d glasm: Add logical instructions on GLASM 2021-07-22 21:51:31 -04:00
ReinUsesLisp
99352741af glasm: Remove duplicated Fp64 pack instructions on GLASM 2021-07-22 21:51:30 -04:00
ReinUsesLisp
45ef62d3ba glasm: Remove unnecesary new white space on Clamp GLASM 2021-07-22 21:51:30 -04:00
ReinUsesLisp
b4953e79ee glasm: Add floating-point comparisons on GLASM 2021-07-22 21:51:30 -04:00
ameerj
6705f56029 emit_glasm: Implement more integer alu ops 2021-07-22 21:51:30 -04:00
ameerj
3e10709091 glasm: Reimplement bitwise ops and BFI/BFE 2021-07-22 21:51:30 -04:00
ReinUsesLisp
4502595bc2 glasm: Initial GLASM fp64 support 2021-07-22 21:51:30 -04:00
ReinUsesLisp
9f851e3832 glasm: Implement GLASM fp16 packing and move bitwise insns 2021-07-22 21:51:30 -04:00
ReinUsesLisp
4de65fbff4 glasm: Remove unused functions left from rebase 2021-07-22 21:51:30 -04:00
ReinUsesLisp
6358b0d0c1 glasm: Specify namespace when using FormatTo 2021-07-22 21:51:30 -04:00
ReinUsesLisp
939dab7120 glasm: Implement more GLASM composite instructions 2021-07-22 21:51:30 -04:00
ReinUsesLisp
01e18581b9 vk_pipeline_cache: Enable int8 and int16 types on Vulkan 2021-07-22 21:51:30 -04:00
ReinUsesLisp
1c9307969c glasm: Make GLASM aware of types 2021-07-22 21:51:30 -04:00
ameerj
934d300246 glasm: Use CMP.S for Select32
also fixes ADD and SUB to use U modifier
2021-07-22 21:51:30 -04:00
ameerj
68cc445b8e glasm: Implement more logical ops 2021-07-22 21:51:30 -04:00
ameerj
941c6dc740 glasm: Implement BFI, BFE
Along with implementations of common instructions along the way
2021-07-22 21:51:30 -04:00
ReinUsesLisp
3e841f6441 glasm: Use BitField instead of C bitfields 2021-07-22 21:51:30 -04:00
ReinUsesLisp
2b04b4d27f glasm: Remove unused argument in identity instructions on GLASM 2021-07-22 21:51:30 -04:00
ReinUsesLisp
dc02cb92e4 gl_rasterizer: Flush L2 caches before glFlush on GLASM 2021-07-22 21:51:30 -04:00
ReinUsesLisp
2c81ad8311 glasm: Initial GLASM compute implementation for testing 2021-07-22 21:51:30 -04:00
ReinUsesLisp
6fd190d1ae glasm: Implement basic GLASM instructions 2021-07-22 21:51:30 -04:00
ReinUsesLisp
c1ba685d9c glasm: Changes to GLASM register allocator and emit context 2021-07-22 21:51:30 -04:00
ReinUsesLisp
36f1586267 vk_scheduler: Use locks instead of SPSC a queue
This tries to fix a data race where we'd wait forever for the GPU.
2021-07-22 21:51:30 -04:00
ReinUsesLisp
56c47951c5 vk_query_cache: Wait before reading queries 2021-07-22 21:51:30 -04:00
ReinUsesLisp
a515036604 vk_master_semaphore: Use fetch_add to increase master semaphore tick 2021-07-22 21:51:30 -04:00
ReinUsesLisp
b10cf64c48 glasm: Add GLASM backend infrastructure 2021-07-22 21:51:30 -04:00
ameerj
09dc23f971 shader: ISET.X implementation 2021-07-22 21:51:30 -04:00
ReinUsesLisp
bfa47539f6 gl_shader_cache: Remove code unintentionally committed 2021-07-22 21:51:30 -04:00
ReinUsesLisp
b725db8709 shader: Fixup SPIR-V emit header namespaces 2021-07-22 21:51:30 -04:00
ReinUsesLisp
bed090807a Move SPIR-V emission functions to their own header 2021-07-22 21:51:30 -04:00
FernandoS27
ee61ec2c39 shader: Optimize NVN Fallthrough 2021-07-22 21:51:30 -04:00
FernandoS27
153a77efee shader: Stub SR_AFFINITY 2021-07-22 21:51:30 -04:00
ameerj
7ecc6de56a shader: Implement Int32 SUATOM/SURED 2021-07-22 21:51:30 -04:00
ReinUsesLisp
d621e96d0d shader: Initial OpenGL implementation 2021-07-22 21:51:30 -04:00
ReinUsesLisp
850b08a16c spirv: Be aware of NAN unaware drivers 2021-07-22 21:51:30 -04:00
ReinUsesLisp
fde47152d9 spirv: Add SSBO read fallbacks when no aliasing is available 2021-07-22 21:51:29 -04:00
ReinUsesLisp
fd913bceaf spirv: Add OpKill fallback to demote 2021-07-22 21:51:29 -04:00
ReinUsesLisp
d2a0f9d7ad spirv: Do not enable ShaderLayer
This is enabled by an extension instead of the capability.
2021-07-22 21:51:29 -04:00
ReinUsesLisp
2b434b74af spirv: Enable DemoteToHelperInvocationEXT only when supported 2021-07-22 21:51:29 -04:00
ReinUsesLisp
cfd873275d spirv: Use OriginLowerLeft when requested 2021-07-22 21:51:29 -04:00
ReinUsesLisp
bafe9e35a9 spirv: Only add image operands mask when needed 2021-07-22 21:51:29 -04:00
ReinUsesLisp
d2e811db2e spirv: Workaround image unsigned offset bug
Workaround bug on Nvidia's OpenGL SPIR-V compiler when using unsigned
texture offsets.
2021-07-22 21:51:29 -04:00
ReinUsesLisp
4ead714910 spirv: Add int8 and int16 capabilities only when supported 2021-07-22 21:51:29 -04:00
ReinUsesLisp
33bebc3412 spirv: Add integer clamping workarounds
Workaround more bugs on Nvidia's OpenGL SPIR-V compiler.
2021-07-22 21:51:29 -04:00
ReinUsesLisp
7b03b97118 spirv: Implement int8 and int16 conversion fallbacks 2021-07-22 21:51:29 -04:00
ReinUsesLisp
48a17298d7 spirv: Support OpenGL uniform buffers and change bindings 2021-07-22 21:51:29 -04:00
ReinUsesLisp
d5d6778ba5 spirv: Desambiguate descriptor names
Worksaround a bug on Nvidia's OpenGL SPIR-V compiler where names are
used for name matching.
2021-07-22 21:51:29 -04:00
ReinUsesLisp
a46d91b1ef shader: Add OpenGL shader profile options 2021-07-22 21:51:29 -04:00
ReinUsesLisp
028f0033bd shader: Remove shader util 2021-07-22 21:51:29 -04:00
FernandoS27
c49d56c931 shader: Address feedback 2021-07-22 21:51:29 -04:00
FernandoS27
b541f5e5e3 shader: Implement VertexA stage 2021-07-22 21:51:29 -04:00
FernandoS27
da936d6ad8 shader: Implement delegation of Exit to dispatcher on CFG 2021-07-22 21:51:29 -04:00
ReinUsesLisp
f4b82b8dd7 vk_graphics_pipeline: Fix texture buffer descriptors 2021-07-22 21:51:29 -04:00
ameerj
fb14820c86 shader: Fix IADD3.CC 2021-07-22 21:51:29 -04:00
ReinUsesLisp
53acdda772 vk_scheduler: Allow command submission on worker thread
This changes how Scheduler::Flush works. It queues the current command
buffer to be sent to the GPU but does not do it immediately. The Vulkan
worker thread takes care of that. Users will have to use
Scheduler::Flush + Scheduler::WaitWorker to get the previous behavior.

Scheduler::Finish is unchanged.

To avoid waiting on work never queued, Scheduler::Wait sends the current
command buffer if that's what the caller wants to wait.
2021-07-22 21:51:29 -04:00
ReinUsesLisp
c5425b38c1 vk_compute_pass: Fix -Wshadow warning 2021-07-22 21:51:29 -04:00
ReinUsesLisp
025b20f96a shader: Move pipeline cache logic to separate files
Move code to separate files to be able to reuse it from OpenGL. This
greatly simplifies the pipeline cache logic on Vulkan.

Transform feedback state is not yet abstracted and it's still
intrusively stored inside vk_pipeline_cache. It will be moved when
needed on OpenGL.
2021-07-22 21:51:29 -04:00
ReinUsesLisp
ac8835659e vulkan: Defer descriptor set work to the Vulkan thread
Move descriptor lookup and update code to a separate thread. Delaying
this removes work from the main GPU thread and allows creating
descriptor layouts on another thread. This reduces a bit the workload
of the main thread when new pipelines are encountered.
2021-07-22 21:51:29 -04:00
ReinUsesLisp
2f3c3dfc10 vulkan: Rework descriptor allocation algorithm
Create multiple descriptor pools on demand. There are some degrees of
freedom what is considered a compatible pool to avoid wasting large
pools on small descriptors.
2021-07-22 21:51:29 -04:00
ReinUsesLisp
5ed871398b vk_graphics_pipeline: Generate specialized pipeline config functions and improve code 2021-07-22 21:51:29 -04:00
ReinUsesLisp
f4ace63957 shader: Accelerate pipeline transitions and use dirty flags for shaders 2021-07-22 21:51:29 -04:00
ameerj
20e86fd615 shader: Fix BFE s32 undefined check
Our unit tests were hitting this exception.
2021-07-22 21:51:29 -04:00
ReinUsesLisp
8fda599a31 vk_compute_pipeline: Fix index comparison oversight on compute texture buffers 2021-07-22 21:51:29 -04:00
ReinUsesLisp
50eb03382e shader: Fix error checking in bitfieldExtract and implement bitfieldInsert folding 2021-07-22 21:51:29 -04:00
ReinUsesLisp
0c0ee9d897 vulkan_device: Require shaderClipDistance and shaderCullDistance features 2021-07-22 21:51:29 -04:00
ReinUsesLisp
5b1b06f11e vk_graphics_pipeline: Guard against non-tessellation pipelines using patches 2021-07-22 21:51:29 -04:00
ReinUsesLisp
57464e3b72 shader: Fix storage type when reading patches on tess control 2021-07-22 21:51:29 -04:00
ReinUsesLisp
d2b54c6e42 shader: Fix VMNMX selector B 2021-07-22 21:51:29 -04:00
Rodrigo Locatti
2dc86372c7 shader: Fix bugs and build issues on GCC 2021-07-22 21:51:29 -04:00
ReinUsesLisp
7a1f296cda shader: Fix render targets with null attachments 2021-07-22 21:51:29 -04:00
ReinUsesLisp
155be4a8d3 shader: Increase the maximum number of storage buffers
Compute shaders spill uniform buffers on storage buffers, increasing the
expected number.
2021-07-22 21:51:29 -04:00
ReinUsesLisp
fe25f42403 shader: Remove identity removal pass for better build times 2021-07-22 21:51:29 -04:00
ReinUsesLisp
0c7230a606 shader: Add more strict validation the pass 2021-07-22 21:51:29 -04:00
ReinUsesLisp
25949b864c shader: Fix forward referencing identity instructions when inserting phi 2021-07-22 21:51:29 -04:00
ReinUsesLisp
92a01984e6 shader: Remove invalidated blocks in dead code elimination pass 2021-07-22 21:51:29 -04:00
ReinUsesLisp
aece958c2b shader: Add missing UndoUse case for GetSparseFromOp 2021-07-22 21:51:29 -04:00
ReinUsesLisp
0ace34575c shader: Require dual source blending 2021-07-22 21:51:29 -04:00
ReinUsesLisp
21e3382830 shader: Simplify code in opcodes.h to fix Intellisense
Avoid using std::array to fix Intellisense not properly compiling this
code and disabling itself on all files that include it.

While we are at it, change the code to use u8 instead of size_t for the
number of instructions in an opcode.
2021-07-22 21:51:29 -04:00
ReinUsesLisp
d10cf55353 shader: Implement indexed textures 2021-07-22 21:51:28 -04:00
ameerj
7a9dc78398 shader: Refactor atomic_operations_global_memory 2021-07-22 21:51:28 -04:00
ameerj
427951d6fe shader: add missing include guard in half_floating_point_helper.h 2021-07-22 21:51:28 -04:00
ReinUsesLisp
c8f9772d65 shader: Fix gcc warnings 2021-07-22 21:51:28 -04:00
ReinUsesLisp
75dee55486 shader: Inline common Value getters 2021-07-22 21:51:28 -04:00
ReinUsesLisp
23182fa59c shader: Intrusively store in a block if it's sealed or not 2021-07-22 21:51:28 -04:00
ReinUsesLisp
eed6da55b8 cmake: Link to common in shader_recompiler 2021-07-22 21:51:28 -04:00
ReinUsesLisp
cc0fcd1b8d shader: Improve goto removal algorithm complexity
Find sibling node containing a nephew searching from the nephew itself
instead of the uncle.
2021-07-22 21:51:28 -04:00
ReinUsesLisp
f66851e376 shader: Use memset to reset instruction arguments 2021-07-22 21:51:28 -04:00
ReinUsesLisp
c84bbd9e44 shader: Inline common Value functions into the header 2021-07-22 21:51:28 -04:00
ReinUsesLisp
050e81500c shader: Move microinstruction header to the value header 2021-07-22 21:51:28 -04:00
ReinUsesLisp
e4d1122082 shader: Move siblings check to a separate function and comment them out 2021-07-22 21:51:28 -04:00
ReinUsesLisp
4209828646 shader: Intrusively store register values in block for SSA pass 2021-07-22 21:51:28 -04:00
ReinUsesLisp
6944cabb89 shader: Inline common Opcode and Inst functions 2021-07-22 21:51:28 -04:00
ReinUsesLisp
4bbe530337 shader: Inline common IR::Block methods 2021-07-22 21:51:28 -04:00
ReinUsesLisp
24cc298660 shader: Use a small_vector for phi blocks 2021-07-22 21:51:28 -04:00
ReinUsesLisp
79c2e43fcd shader: Calculate number of arguments in an opcode at compile time 2021-07-22 21:51:28 -04:00
ReinUsesLisp
dd860b684c shader: Implement D3D samplers 2021-07-22 21:51:28 -04:00
ReinUsesLisp
a8d46a5eae shader: Add constant propagation for arithmetic right shifts 2021-07-22 21:51:28 -04:00
ReinUsesLisp
469f8bb857 shader: Simplify code for local memory 2021-07-22 21:51:28 -04:00
ReinUsesLisp
7018e524f5 shader: Add NVN storage buffer fallbacks
When we can't track the SSBO origin of a global memory instruction,
leave it as a global memory operation and assume these pointers are in
the NVN storage buffer slots, then apply a linear search in the shader's
runtime.
2021-07-22 21:51:28 -04:00
ReinUsesLisp
6325601947 spirv: Fix ViewportMask 2021-07-22 21:51:28 -04:00
ameerj
5b8afed871 spirv: Replace Constant/ConstantComposite with Const helper 2021-07-22 21:51:28 -04:00
FernandoS27
2999028976 shader: Address feedback 2021-07-22 21:51:28 -04:00
FernandoS27
881b33da3b shader: Implement F2F (Imm) 2021-07-22 21:51:28 -04:00
FernandoS27
21a878237b shader: Implement IADD3.CC/.X 2021-07-22 21:51:28 -04:00
FernandoS27
f69d0b91ff shader: Address feedback 2021-07-22 21:51:28 -04:00
FernandoS27
080857b60e shader: Add coarse derivatives 2021-07-22 21:51:28 -04:00
FernandoS27
04c459fc8d shader: Implement fine derivates constant propagation 2021-07-22 21:51:28 -04:00
FernandoS27
f18a6dd1bd shader: Implement SR_Y_DIRECTION 2021-07-22 21:51:28 -04:00
ReinUsesLisp
50f8007172 shader: Fix Phi node types 2021-07-22 21:51:28 -04:00
ReinUsesLisp
0a0818c025 shader: Fix memory barriers 2021-07-22 21:51:28 -04:00
ReinUsesLisp
c9e4609d87 spirv: Fix implicit lod type 2021-07-22 21:51:28 -04:00
ReinUsesLisp
7cfa403683 spirv: Use explicit lods outside of fragment shaders 2021-07-22 21:51:28 -04:00
ReinUsesLisp
dbbd4b5496 spirv: Use ConstOffset instead of Offset when possible 2021-07-22 21:51:28 -04:00
ameerj
be431f5ed0 shader: Implement BFE and BFI CC
Fix two bugs in BFI.
2021-07-22 21:51:28 -04:00
ReinUsesLisp
80940b1706 shader: Implement SampleMask 2021-07-22 21:51:28 -04:00
ReinUsesLisp
95815a3883 shader: Implement PIXLD.MY_INDEX 2021-07-22 21:51:28 -04:00
ReinUsesLisp
f3473c5143 spirv: Bitcast non-F32 output attributes to their type before store 2021-07-22 21:51:28 -04:00
ReinUsesLisp
e3514bcd6b spirv: Implement ViewportMask with NV_viewport_array2 2021-07-22 21:51:28 -04:00
ReinUsesLisp
4657cf78fd spirv: Bitcast non-F32 attributes to F32 2021-07-22 21:51:27 -04:00
ReinUsesLisp
b0f1255c8c shader: Implement PrimitiveId 2021-07-22 21:51:27 -04:00
ReinUsesLisp
183855e396 shader: Implement tessellation shaders, polygon mode and invocation id 2021-07-22 21:51:27 -04:00
ReinUsesLisp
34519d3fc6 shader: Mark atomic instructions as writes 2021-07-22 21:51:27 -04:00
lat9nq
7ae3ea6bee vk_pipeline_cache: Silence GCC warnings
Silences `-Werror=missing-field-initializers` due to missing
initializers.
2021-07-22 21:51:27 -04:00
ReinUsesLisp
416e1b7441 spirv: Implement image buffers 2021-07-22 21:51:27 -04:00
ReinUsesLisp
d8ec99dada spirv: Implement Layer stores 2021-07-22 21:51:27 -04:00
FernandoS27
ab3831f6cb spirv: Fix alpha test 2021-07-22 21:51:27 -04:00
ameerj
6f4a1c8dcf spirv: Fix non-atomic 64-bit store 2021-07-22 21:51:27 -04:00
ameerj
6c512f4bff spirv: Implement alpha test 2021-07-22 21:51:27 -04:00
ReinUsesLisp
b126987c59 shader: Implement transform feedbacks and define file format 2021-07-22 21:51:27 -04:00
ReinUsesLisp
a83579b50a shader: Implement early Z tests 2021-07-22 21:51:27 -04:00
ReinUsesLisp
09165ae189 shader: Document and relax cache control on surface instructions 2021-07-22 21:51:27 -04:00
ReinUsesLisp
fa75b9b062 spirv: Rework storage buffers and shader memory 2021-07-22 21:51:27 -04:00
ReinUsesLisp
c070991def shader: Fix fixed pipeline point size on geometry shaders 2021-07-22 21:51:27 -04:00
ReinUsesLisp
2597cee85b shader: Add constant propagation for *&^| binary operations 2021-07-22 21:51:27 -04:00
ReinUsesLisp
f263760c5a shader: Implement geometry shaders 2021-07-22 21:51:27 -04:00
ReinUsesLisp
a6cef71cc0 shader: Implement OUT 2021-07-22 21:51:27 -04:00
lat9nq
dd3432d357 internal_stage_buffer_entry_read: Remove pragma optimize off 2021-07-22 21:51:27 -04:00
ReinUsesLisp
4b0172f6de shader: Stub SR_INVOCATION_INFO 2021-07-22 21:51:27 -04:00
ReinUsesLisp
f712084147 shader: Stub ISBERD 2021-07-22 21:51:27 -04:00
ReinUsesLisp
2516829e4c shader: Fix CC in I2I 2021-07-22 21:51:27 -04:00
ReinUsesLisp
23b8714732 spirv: Define StorageImageWriteWithoutFormat capability when used 2021-07-22 21:51:27 -04:00
ReinUsesLisp
a33014022e pipeline_helper: Simplify descriptor objects initialization 2021-07-22 21:51:27 -04:00
ReinUsesLisp
415c7e46ed shader: Simplify FLO and throw on CC 2021-07-22 21:51:27 -04:00
ReinUsesLisp
dfd5341d71 shader: Mark blocks with no end branch as unreachable 2021-07-22 21:51:27 -04:00
ReinUsesLisp
2ed80f6b1e shader: Implement LOP CC 2021-07-22 21:51:27 -04:00
ReinUsesLisp
5c61e860e4 shader: Implement SR_THREAD_KILL 2021-07-22 21:51:27 -04:00
ReinUsesLisp
c9337a4ae4 shader: Apply sign bit in FCMP (imm) 2021-07-22 21:51:27 -04:00
ameerj
3db2b3effa shader: Implement ATOM/S and RED 2021-07-22 21:51:27 -04:00
ReinUsesLisp
479ca00071 nsight_aftermath_tracker: Report used shaders to Nsight Aftermath 2021-07-22 21:51:27 -04:00
ReinUsesLisp
106764a6d5 spirv: Move phi node patching to a separate function 2021-07-22 21:51:27 -04:00
ReinUsesLisp
ab543f1821 spirv: Guard against typeless image reads on unsupported devices 2021-07-22 21:51:27 -04:00
ReinUsesLisp
9280cd649a shader: Move LaneId to the warp emission file and fix AMD 2021-07-22 21:51:27 -04:00
ReinUsesLisp
1030b612a3 vk_rasterizer: Request outside render pass execution context for compute 2021-07-22 21:51:27 -04:00
ReinUsesLisp
e5e79648cf pipeline_helper: Add missing [[maybe_unused]] 2021-07-22 21:51:27 -04:00
ReinUsesLisp
2e71e4c5c0 spirv: Fix forward declarations on phi nodes 2021-07-22 21:51:27 -04:00
ReinUsesLisp
d404b871d5 shader: Mark ImageWrite with side effects 2021-07-22 21:51:27 -04:00
FernandoS27
1be6705408 shader: Implement CC for ISET, FSET, PSET, CSET, and DSET
Throw when other instructions are missing CC.
2021-07-22 21:51:27 -04:00
ReinUsesLisp
8cea39b5a6 shader: Remove outdated comment in F2I 2021-07-22 21:51:27 -04:00
ReinUsesLisp
7cb2ab3585 shader: Implement SULD and SUST 2021-07-22 21:51:26 -04:00
ReinUsesLisp
094da34456 shader: Fix Windows build issues 2021-07-22 21:51:26 -04:00
lat9nq
5bfcafa0a2 shader: Address feedback + clang format 2021-07-22 21:51:26 -04:00
lat9nq
0bb85f6a75 shader_recompiler,video_core: Cleanup some GCC and Clang errors
Mostly fixing unused *, implicit conversion, braced scalar init,
fpermissive, and some others.

Some Clang errors likely remain in video_core, and std::ranges is still
a pertinent issue in shader_recompiler

shader_recompiler: cmake: Force bracket depth to 1024 on Clang
Increases the maximum fold expression depth

thread_worker: Include condition_variable

Don't use list initializers in control flow

Co-authored-by: ReinUsesLisp <reinuseslisp@airmail.cc>
2021-07-22 21:51:26 -04:00
ReinUsesLisp
5cd3d00167 shader: Fix FCMP immediate variant 2021-07-22 21:51:26 -04:00
ReinUsesLisp
233e39bb7b shader: Fix dangling labels 2021-07-22 21:51:26 -04:00
ReinUsesLisp
e9a91bc5cc shader: Interact texture buffers with buffer cache 2021-07-22 21:51:26 -04:00
ReinUsesLisp
56b92bd89c shader: Fix F2I 2021-07-22 21:51:26 -04:00
ReinUsesLisp
ef88552224 shader: Fix TextureGrad 2021-07-22 21:51:26 -04:00
ReinUsesLisp
1f3eb601ac shader: Implement texture buffers 2021-07-22 21:51:26 -04:00
FernandoS27
dcaf0e9150 shader: Address feedback 2021-07-22 21:51:26 -04:00
FernandoS27
73cb17f41b shader: Implement indexed Position and ClipDistances 2021-07-22 21:51:26 -04:00
FernandoS27
1d51803169 shader: Implement indexed attributes 2021-07-22 21:51:26 -04:00
FernandoS27
0df7e509db shader: Implement AL2P 2021-07-22 21:51:26 -04:00
FernandoS27
20ba0ea0a9 shader: Fix BRX tracking 2021-07-22 21:51:26 -04:00
ReinUsesLisp
bfeeb23ddc vk_pipeline_cache: Fix num of pipeline workers on weird platforms 2021-07-22 21:51:26 -04:00
ReinUsesLisp
417fb5d385 shader: Move recursive SSA rewrite to the heap 2021-07-22 21:51:26 -04:00
FernandoS27
72daa2a039 shader: Fix ShadowCube declaration type, set number of pipeline threads based on hardware 2021-07-22 21:51:26 -04:00
ReinUsesLisp
9e6fe430bd shader: Fix splits on blocks using indirect branches 2021-07-22 21:51:26 -04:00
ReinUsesLisp
ffca21487f shader: Eliminate orphan blocks more efficiently 2021-07-22 21:51:26 -04:00
ReinUsesLisp
da6cf2632c shader: Add subgroup masks 2021-07-22 21:51:26 -04:00
ReinUsesLisp
fc93bc2abd shader: Implement BAR and fix memory barriers 2021-07-22 21:51:26 -04:00
ReinUsesLisp
85795de99f shader: Abstract breadth searches and use the abstraction 2021-07-22 21:51:26 -04:00
ReinUsesLisp
3f594dd86b shader: Reimplement GetCbufU64 as GetCbufU32x2
It may generate better code on some compilers and it's easier to handle.
2021-07-22 21:51:26 -04:00
ReinUsesLisp
5b3c6d59c2 vk_compute_pass: Fix compute passes 2021-07-22 21:51:26 -04:00
ReinUsesLisp
5ed68e83db shader: Remove atomic flags and use mutex + cond variable for pipelines 2021-07-22 21:51:26 -04:00
ReinUsesLisp
0b26f2b90e shader: Remove unused header in VOTE 2021-07-22 21:51:26 -04:00
ReinUsesLisp
6ff2e9ba09 vk_pipeline_cache: Remove unnecesary scope in pipeline cache locking 2021-07-22 21:51:26 -04:00
ReinUsesLisp
9a342f5605 shader: Rework global memory tracking to use breadth-first search 2021-07-22 21:51:26 -04:00
ReinUsesLisp
c4aab5c40e shader: Fix fp16 merge when using native fp16 2021-07-22 21:51:26 -04:00
ReinUsesLisp
ca7ebdc471 shader: Fix FADD32I 2021-07-22 21:51:26 -04:00
FernandoS27
e7700aad18 shader: Fix undetected bug from review 2021-07-22 21:51:26 -04:00
FernandoS27
ed6a1b1a3d shader: Address feedback 2021-07-22 21:51:26 -04:00
FernandoS27
80df541a08 shader: "Implement" NOP 2021-07-22 21:51:26 -04:00
FernandoS27
480dc0d5e6 vk_pipeline_cache: Small fixes to the pipeline cache 2021-07-22 21:51:26 -04:00
FernandoS27
baec84247f shader: Address Feedback 2021-07-22 21:51:26 -04:00
FernandoS27
45d547af11 shader: Implement SR_LaneId 2021-07-22 21:51:26 -04:00
FernandoS27
595806fb1c shader: Fix shared memory on cool drivers 2021-07-22 21:51:26 -04:00
FernandoS27
655f7a570a shader: Implement MEMBAR 2021-07-22 21:51:26 -04:00
FernandoS27
ecb30c9072 shader: Improve VOTE.VTG stub 2021-07-22 21:51:25 -04:00
FernandoS27
12f5f32098 shader: Mark SSBOs as written when they are 2021-07-22 21:51:25 -04:00
FernandoS27
d819ba4489 shader: Implement ViewportIndex 2021-07-22 21:51:25 -04:00
FernandoS27
fd496d0401 shader: Stub TLD4's PTP when it isn't constant 2021-07-22 21:51:25 -04:00
FernandoS27
5ed8f24384 shader: Stub VOTE.VTG 2021-07-22 21:51:25 -04:00
FernandoS27
bee8188799 shader: Fold composite extract 2021-07-22 21:51:25 -04:00
FernandoS27
c3bace756f shader: Fold comparisons and Pack/Unpack16 2021-07-22 21:51:25 -04:00
ReinUsesLisp
b4a5e767d0 shader: Fix branches to visited virtual blocks 2021-07-22 21:51:25 -04:00
ReinUsesLisp
d0a529683a vulkan: Serialize pipelines on a separate thread 2021-07-22 21:51:25 -04:00
ReinUsesLisp
8771639d1e vulkan: Create pipeline layouts in separate threads 2021-07-22 21:51:25 -04:00
ReinUsesLisp
2fc698b040 vulkan: Build pipelines in parallel at runtime
Wait from the worker thread for a pipeline to build before binding it to
the command buffer. This allows queueing pipelines to multiple threads.
2021-07-22 21:51:25 -04:00
ReinUsesLisp
f1dd743731 shader: Fix dependency on identity removal pass 2021-07-22 21:51:25 -04:00
ReinUsesLisp
5f22cd89e2 shader: Fix constant propagation to use reverse post order 2021-07-22 21:51:25 -04:00
ReinUsesLisp
eaafd53cfe shader: Implement LDG .U.128 as .128 2021-07-22 21:51:25 -04:00
ReinUsesLisp
c826220733 shader: Unroll "using enum" for opcode declarations 2021-07-22 21:51:25 -04:00
ReinUsesLisp
0c933e20de vk_pipeline_cache: Name SPIR-V modules 2021-07-22 21:51:25 -04:00
ReinUsesLisp
09e1927b70 spirv: Remove unnecesary variable for clip distances 2021-07-22 21:51:25 -04:00
FernandoS27
0c4cf3b9eb shader: Implement ClipDistance 2021-07-22 21:51:25 -04:00
FernandoS27
67afdaf566 shader: Fix TXD 2021-07-22 21:51:25 -04:00
FernandoS27
4d0d29fc20 shader: Address feedback 2021-07-22 21:51:25 -04:00
ReinUsesLisp
cb6fc03e55 shader: Always pass a lod for TexelFetch 2021-07-22 21:51:25 -04:00
FernandoS27
630273b629 shader: Implement TXD 2021-07-22 21:51:25 -04:00
FernandoS27
d5bfc63088 shader: Implement ImageGradient 2021-07-22 21:51:25 -04:00
FernandoS27
be3e94ae55 shader: Implement TMML partially 2021-07-22 21:51:25 -04:00
FernandoS27
613b48c4a2 shader,spirv: Implement ImageQueryLod. 2021-07-22 21:51:25 -04:00
FernandoS27
2c276ec6eb shader: Implement TLDS 2021-07-22 21:51:25 -04:00
FernandoS27
dc1a9a3bed shader: Implement TLD 2021-07-22 21:51:25 -04:00
ReinUsesLisp
7a1c14269e spirv: Add fixed pipeline point size 2021-07-22 21:51:25 -04:00
FernandoS27
9d7422d967 shader: Add PointCoord attribute 2021-07-22 21:51:25 -04:00
ameerj
b7589fe115 shader: Add PointSize attribute 2021-07-22 21:51:25 -04:00
ReinUsesLisp
514a6b07ee shader: Store type of phi nodes in flags
This is needed because pseudo-instructions where invalidated.
2021-07-22 21:51:25 -04:00
ReinUsesLisp
b0d5572abf shader: Fix indirect branches to scheduler instructions 2021-07-22 21:51:25 -04:00
ReinUsesLisp
55b960a20f spirv: Fix default output attribute initialization 2021-07-22 21:51:25 -04:00
ReinUsesLisp
12783f8105 shader: Add missing new lines 2021-07-22 21:51:25 -04:00
ameerj
6c51f49632 shader: Implement FSWZADD 2021-07-22 21:51:25 -04:00
FernandoS27
34aba9627a shader: Implement BRX 2021-07-22 21:51:25 -04:00
ReinUsesLisp
39a379632e shader: Fix alignment checks on RZ 2021-07-22 21:51:25 -04:00
ameerj
73af0d2e0d shader: Implement I2I CC 2021-07-22 21:51:25 -04:00
ameerj
dbc1e5cde7 shader: Implement I2I SAT 2021-07-22 21:51:25 -04:00
ReinUsesLisp
3c758d9b53 vk_pipeline_cache: Fix size hashing of shaders 2021-07-22 21:51:25 -04:00
ameerj
cd9f75e223 shader: Fix ISCADD logic for PO/CC 2021-07-22 21:51:25 -04:00
ReinUsesLisp
e860870dd2 shader: Implement LDS, STS, LDL, and STS and use SPIR-V 1.4 when available 2021-07-22 21:51:25 -04:00
ameerj
84298ce191 shader: Implement ISCADD CC 2021-07-22 21:51:24 -04:00
ameerj
51475e21ba shader: Implement VMAD, VMNMX, VSETP 2021-07-22 21:51:24 -04:00
ReinUsesLisp
0e1b213fa7 shader: Add missing I2I exception when CC is used 2021-07-22 21:51:24 -04:00
ReinUsesLisp
dbd882ddeb shader: Better interpolation and disabled attributes support 2021-07-22 21:51:24 -04:00
ReinUsesLisp
675a82416d spirv: Remove dependencies on Environment when generating SPIR-V 2021-07-22 21:51:24 -04:00
ReinUsesLisp
cb6039ccea vk_pipeline_cache: Fix pipeline and shader caches 2021-07-22 21:51:24 -04:00
ReinUsesLisp
f0031babeb shader: Implement front face 2021-07-22 21:51:24 -04:00
ReinUsesLisp
a806b29cb9 shader: Fix structured control flow on KIL instructions
This could potentially leave unvisited blocks, leading to illegal phi
nodes.
2021-07-22 21:51:24 -04:00
FernandoS27
cdf0cc3869 shader: Fix TXQ 2021-07-22 21:51:24 -04:00
ReinUsesLisp
ec005be99d shader: Fix rasterizer integration order issues 2021-07-22 21:51:24 -04:00
ReinUsesLisp
17063d16a3 shader: Implement TXQ and fix FragDepth 2021-07-22 21:51:24 -04:00
ReinUsesLisp
d9c5bd9509 shader: Refactor PTP and other minor changes 2021-07-22 21:51:24 -04:00
FernandoS27
b5db38f50e shader: Add IR opcode for ImageFetch 2021-07-22 21:51:24 -04:00
FernandoS27
742d11c2ad shader: Implement TLD4.PTP 2021-07-22 21:51:24 -04:00
FernandoS27
981eb6f43b shader: Fix Array Indices in TEX/TLD4 2021-07-22 21:51:24 -04:00
FernandoS27
f5672777c8 shader: Implement FragDepth 2021-07-22 21:51:24 -04:00
FernandoS27
fda0835300 shader: Implement TLD4S. 2021-07-22 21:51:24 -04:00
FernandoS27
c7c518e280 shader: Implement TLD4 and TLD4_B 2021-07-22 21:51:24 -04:00
ameerj
32c5483beb shader: Implement SHFL 2021-07-22 21:51:24 -04:00
ReinUsesLisp
49e87ea8ab shader: Track first bindless argument instead of the instruction itself 2021-07-22 21:51:24 -04:00
ReinUsesLisp
d3dad6b632 shader: Properly insert Prologue instruction 2021-07-22 21:51:24 -04:00
ReinUsesLisp
83a283fa86 shader: Minor style nits 2021-07-22 21:51:24 -04:00
FernandoS27
8cb9443cb9 shader: Fix F2I 2021-07-22 21:51:24 -04:00
ReinUsesLisp
68a9505d8a shader: Implement NDC [-1, 1], attribute types and default varying initialization 2021-07-22 21:51:24 -04:00
ReinUsesLisp
1d2db78398 shader: Fix use-after-free bug in object_pool 2021-07-22 21:51:24 -04:00
ameerj
3d07cef009 shader: Implement VOTE 2021-07-22 21:51:24 -04:00
ReinUsesLisp
d40faa1db0 vk_pipeline_cache: Fix ReleaseContents order 2021-07-22 21:51:24 -04:00
ReinUsesLisp
a8d8fd40f7 shader: Fix TEX mask 2021-07-22 21:51:24 -04:00
ReinUsesLisp
f8115a6a9e vk_pipeline_cache: Add pipeline cache 2021-07-22 21:51:24 -04:00
ReinUsesLisp
c63cf4fa2e vk_pipeline_cache: Add pipeline cache 2021-07-22 21:51:24 -04:00
ReinUsesLisp
2be5c7eff4 shader: Fold interpolation multiplications 2021-07-22 21:51:24 -04:00
ReinUsesLisp
96b7ced6ec shader: Better but still partial interpolation support 2021-07-22 21:51:24 -04:00
ameerj
e4e1cc11b8 shader: Implement DMNMX, DSET, DSETP 2021-07-22 21:51:24 -04:00
FernandoS27
56be556eee shader: Implement FADD32I 2021-07-22 21:51:24 -04:00
FernandoS27
a62f04efab shader: Implement F2F 2021-07-22 21:51:24 -04:00
ReinUsesLisp
8b3b9c3371 shader: Add missing fp64 usage flags 2021-07-22 21:51:24 -04:00
ameerj
c858b8ba97 shader: Implement DMUL and DFMA
Also add a missing const on DADD
2021-07-22 21:51:24 -04:00
ameerj
112b8f00f0 shader: Add FP64 register load/store helpers 2021-07-22 21:51:24 -04:00
ReinUsesLisp
a77e764726 shader: Add support for fp16 comparisons and misc fixes 2021-07-22 21:51:24 -04:00
FernandoS27
27fb97377e shader: Fix floating point comparison for FP16 2021-07-22 21:51:23 -04:00
FernandoS27
e10d9c1b8e shader: Implement HSETP2 2021-07-22 21:51:23 -04:00
FernandoS27
9e213fd861 shader: Implement HSET2 2021-07-22 21:51:23 -04:00
FernandoS27
ed6cd3c94a shader: Implement HMUL2 2021-07-22 21:51:23 -04:00
FernandoS27
28dff6a629 shader: Implement HFMA2 2021-07-22 21:51:23 -04:00
ReinUsesLisp
76c8a962ac spirv: Implement VertexId and InstanceId, refactor code 2021-07-22 21:51:23 -04:00
FernandoS27
e802512d8e shader: Refactor half floating instructions 2021-07-22 21:51:23 -04:00
ReinUsesLisp
f91859efd2 shader: Implement I2F 2021-07-22 21:51:23 -04:00
ReinUsesLisp
c97d03efb9 shader: Implement ISCADD (imm) 2021-07-22 21:51:23 -04:00
ReinUsesLisp
eeb1efa2d2 shader: Implement LOP32I 2021-07-22 21:51:23 -04:00
ReinUsesLisp
260743f371 shader: Add partial rasterizer integration 2021-07-22 21:51:23 -04:00
ameerj
72990df7ba shader: Implement DADD 2021-07-22 21:51:23 -04:00
ameerj
3b7fd3ad0f shader: Implement CSET and CSETP 2021-07-22 21:51:23 -04:00
ReinUsesLisp
32b6c63485 shader: Reorder phi nodes when redefined as undefined opcodes 2021-07-22 21:51:23 -04:00
ReinUsesLisp
8dd0acfaeb shader: Fix instruction transitions in and out of Phi 2021-07-22 21:51:23 -04:00
ameerj
fa2f6e38f4 shader: Implement FSET and FSETP
Also fix oversight with adding SignedZeroInfNanPreserve execution mode.
2021-07-22 21:51:23 -04:00
ReinUsesLisp
17a82b56d7 shader: Implement TEXS 2021-07-22 21:51:23 -04:00
ReinUsesLisp
71f96fa636 shader: Implement CAL inlining function calls 2021-07-22 21:51:23 -04:00
ameerj
b9f7bf4472 spirv: Add SignedZeroInfNanPreserve logic 2021-07-22 21:51:23 -04:00
ameerj
8d470c2e63 shader: Implement FMNMX
And add a const in FCMP
2021-07-22 21:51:23 -04:00
ReinUsesLisp
2d422b2498 shader: Fix rebase issue 2021-07-22 21:51:23 -04:00
ameerj
ba8c1d2eb4 shader: Implement FCMP
still need to configure some settings for NV denorm flush and intel NaN
2021-07-22 21:51:23 -04:00
ReinUsesLisp
3a63fa0477 shader: Partial implementation of LDC 2021-07-22 21:51:23 -04:00
ReinUsesLisp
ab46371247 shader: Initial support for textures and TEX 2021-07-22 21:51:23 -04:00
ameerj
7d6ba5b984 shader: Implement R2P 2021-07-22 21:51:23 -04:00
ameerj
924f0a9149 shader: Implement SHF 2021-07-22 21:51:23 -04:00
ameerj
5465cb1561 shader: Implement LEA 2021-07-22 21:51:23 -04:00
ReinUsesLisp
d1edc16ba8 shader: Deduplicate HADD2 code 2021-07-22 21:51:23 -04:00
ameerj
81f72471e8 shader: Implement I2I 2021-07-22 21:51:23 -04:00
ReinUsesLisp
4006929c98 shader: Implement HADD2 2021-07-22 21:51:23 -04:00
ameerj
980cafdc27 shader: Implement LOP and LOP3 2021-07-22 21:51:23 -04:00
ameerj
382cba94ed shader: Implement IADD3 2021-07-22 21:51:23 -04:00
ameerj
c2155f04d4 shader: Implement PSETP 2021-07-22 21:51:23 -04:00
ameerj
ce9b116cfe Implement PSET, refactor common comparison funcs 2021-07-22 21:51:23 -04:00
ameerj
103b9da4f7 shader: Implement FLO 2021-07-22 21:51:23 -04:00
ameerj
e038928616 shader: Implement ISET, add common_funcs 2021-07-22 21:51:23 -04:00
ameerj
bec7d3111d shader: Make IMNMX, SHR, SEL stylistically more consistent 2021-07-22 21:51:22 -04:00
ameerj
bce0b1dcca shader: Implement ICMP 2021-07-22 21:51:22 -04:00
ameerj
20390c0548 shader: Implement IMNMX 2021-07-22 21:51:22 -04:00
ameerj
08a9e95905 shader: Implement BFI 2021-07-22 21:51:22 -04:00
ameerj
34ac9b4d7e shader: Implement BFE 2021-07-22 21:51:22 -04:00
ameerj
a8c41c50d3 shader: Implement POPC 2021-07-22 21:51:22 -04:00
ameerj
cc55d28949 shader: Implement SHR 2021-07-22 21:51:22 -04:00
ameerj
8810c88b7e shader: Implement SEL 2021-07-22 21:51:22 -04:00
ReinUsesLisp
726625cf50 spirv: Move phi arguments emit to a separate function 2021-07-22 21:51:22 -04:00
ReinUsesLisp
3bc857f2f3 shader: Avoid infinite recursion when tracking global memory 2021-07-22 21:51:22 -04:00
ReinUsesLisp
622d676202 shader: Fix conditional execution of exit instructions 2021-07-22 21:51:22 -04:00
ReinUsesLisp
7496bbf758 spirv: Add support for self-referencing phi nodes 2021-07-22 21:51:22 -04:00
ReinUsesLisp
e87a502da2 shader: Fix control flow 2021-07-22 21:51:22 -04:00
ReinUsesLisp
9d6a98d950 shader: Implement more of XMAD and FFMA32I and fix XMAD.CBCC 2021-07-22 21:51:22 -04:00
ReinUsesLisp
e44752ddc8 shader: FMUL, select, RRO, and MUFU fixes 2021-07-22 21:51:22 -04:00
ReinUsesLisp
18a766b362 shader: Fix MOV(reg), add SHL variants and emit neg and abs instructions 2021-07-22 21:51:22 -04:00
ReinUsesLisp
274897dfd5 spirv: Fixes and Intel specific workarounds 2021-07-22 21:51:22 -04:00
ReinUsesLisp
704c6f353f shader: Rename, implement FADD.SAT and P2R (imm) 2021-07-22 21:51:22 -04:00
ReinUsesLisp
e2bc05b17d shader: Add denorm flush support 2021-07-22 21:51:22 -04:00
ReinUsesLisp
6db69990da spirv: Add lower fp16 to fp32 pass 2021-07-22 21:51:22 -04:00
ReinUsesLisp
85cce78583 shader: Primitive Vulkan integration 2021-07-22 21:51:22 -04:00
ReinUsesLisp
c67d64365a shader: Remove old shader management 2021-07-22 21:51:22 -04:00
ReinUsesLisp
58914796c0 shader: Add XMAD multiplication folding optimization 2021-07-22 21:51:22 -04:00
ReinUsesLisp
4b438f94cf shader: Simplify ISCADD 2021-07-22 21:51:22 -04:00
ReinUsesLisp
3633e43377 shader: Add utility to resolve identities on a value 2021-07-22 21:51:22 -04:00
ReinUsesLisp
3a59fffaa1 spirv: Implement EmitIdentity 2021-07-22 21:51:22 -04:00
ReinUsesLisp
b5d7279d87 spirv: Initial bindings support 2021-07-22 21:51:22 -04:00
ReinUsesLisp
d5d468cf2c shader: Improve object pool 2021-07-22 21:51:22 -04:00
ReinUsesLisp
1c0b8bca5e shader: Fix tracking 2021-07-22 21:51:22 -04:00
ReinUsesLisp
1b0cf2309c shader: Add support for forward declarations 2021-07-22 21:51:22 -04:00
ReinUsesLisp
cbfb7d182a shader: Support SSA loops on IR 2021-07-22 21:51:22 -04:00
ReinUsesLisp
8af9297f09 shader: Misc fixes 2021-07-22 21:51:22 -04:00
ReinUsesLisp
9170200a11 shader: Initial implementation of an AST 2021-07-22 21:51:22 -04:00
ReinUsesLisp
2930dccecc spirv: Initial SPIR-V support 2021-07-22 21:51:22 -04:00
ReinUsesLisp
6dafb08f52 shader: Better constant folding 2021-07-22 21:51:22 -04:00
ReinUsesLisp
da8096e6e3 shader: Properly store phi on Inst 2021-07-22 21:51:21 -04:00
ReinUsesLisp
16cb00c521 shader: Add pools and rename files 2021-07-22 21:51:21 -04:00
ReinUsesLisp
be94ee88d2 shader: Make typed IR 2021-07-22 21:51:21 -04:00
ReinUsesLisp
dc04a50ac2 shader: Remove illegal character in SSA pass 2021-07-22 21:51:21 -04:00
ReinUsesLisp
e81739493a shader: Constant propagation and global memory to storage buffer 2021-07-22 21:51:21 -04:00
ReinUsesLisp
d24a16045f shader: Initial instruction support 2021-07-22 21:51:21 -04:00
ReinUsesLisp
6c4cc0cd06 shader: SSA and dominance 2021-07-22 21:51:21 -04:00
ReinUsesLisp
2d48a7b4d0 shader: Initial recompiler work 2021-07-22 21:51:21 -04:00
ameerj
75059c46d6 thread_worker: Fix compile time error
state is unused in the branch where with_state is false
2021-07-22 21:51:21 -04:00
bunnei
db46f8a70c Merge pull request #6686 from ReinUsesLisp/vk-optimal-copy
vk_texture_cache: Use VK_IMAGE_LAYOUT_TRANSFER_SRC_OPTIMAL when possible
2021-07-22 12:51:13 -04:00
Morph
233bf018d6 Merge pull request #6693 from lat9nq/cmd-fullscreen-mode-2
yuzu_cmd: Make use of fullscreen_mode setting
2021-07-22 00:55:01 -04:00
bunnei
dff438e219 Merge pull request #6654 from german77/custom_threshold
input_common: Make button threshold customizable
2021-07-21 20:31:33 -04:00
bunnei
346bfb6c47 hle: service: kernel_helpers: Remove unnecessary pragma once. 2021-07-20 18:54:56 -07:00
bunnei
f3db3dcc8d hle: kernel: svc: Remove part of ExitProcess.
- ExitProcess is not actually implemented either way, and this needs more work before we implement.
2021-07-20 18:54:56 -07:00
bunnei
185b19fd5b hle: service: nvdrv: Remove unused kernel reference. 2021-07-20 18:54:56 -07:00
bunnei
6c6e730e9a hle: service: hid: npad: Remove unused kernel reference. 2021-07-20 18:54:56 -07:00
bunnei
52caa52cc2 hle: kernel: Track and release server sessions, and protect methods with locks. 2021-07-20 18:54:56 -07:00
bunnei
8d755147d8 hle: kernel: KProcess: Change process termination assert to a warning.
- Since we do not implement multiprocess right now, this should not be a crashing assert.
2021-07-20 18:54:56 -07:00
bunnei
854c7a3c28 hle: kernel: Ensure current running process is closed. 2021-07-20 18:54:56 -07:00
bunnei
ecf3653444 hle: kernel: Ensure global handle table is finalized before closing. 2021-07-20 18:54:56 -07:00
bunnei
24540e0ad9 kernel: svc: ConnectToNamedPort: Close extra reference to port. 2021-07-20 18:54:56 -07:00
bunnei
7bd020e030 hle: service: sm: Refactor to better manage ports. 2021-07-20 18:54:55 -07:00
bunnei
b119363fc2 hle: kernel: k_process: Close the handle table on shutdown. 2021-07-20 18:54:55 -07:00
bunnei
6020723e77 hle: kernel: k_process: Close main thread reference after it is inserted into handle table. 2021-07-20 18:54:55 -07:00
bunnei
fe402d3506 hle: kernel: Ensure global handle table is initialized. 2021-07-20 18:54:55 -07:00
bunnei
015058fadf hle: service: Add a helper module for managing kernel objects. 2021-07-20 18:54:55 -07:00
bunnei
929994132a hle: kernel: Provide methods for tracking dangling kernel objects. 2021-07-20 18:54:55 -07:00
ReinUsesLisp
a0c4557557 gl_buffer_cache: Use glClearNamedBufferSubData:GL_RED instead of GL_RGBA
Avoids reading out of bounds from the stack.
2021-07-20 18:51:45 -03:00
ReinUsesLisp
6e2ca7fbee buffer_cache: Simplify clear logic
Use existing helper functions and avoid looping when
only one buffer has to be active.
2021-07-20 18:50:51 -03:00
ReinUsesLisp
ad189488b3 vk_texture_cache: Use VK_IMAGE_LAYOUT_TRANSFER_SRC_OPTIMAL when possible
Silences performance warnings generated from validation layers on each frame.
2021-07-20 14:38:58 -03:00
german77
2c339a5114 configure/ui: Add sliders for trigger buttons 2021-07-17 13:30:43 -05:00
german77
240019feca input_common: Make button threshold customizable 2021-07-15 23:56:57 -05:00
Fernando S
598d154f69 Update src/yuzu/main.cpp
Co-authored-by: Morph <39850852+Morph1984@users.noreply.github.com>
2021-07-09 12:28:51 +02:00
Fernando Sahmkow
fd09be5496 Settings: Eliminate ASYNC & MULTICORE Toggles and add GPU Accuracy Toggle. 2021-07-09 02:08:08 +02:00
lat9nq
ef70054367 cmake: Specify the compiler on autotools externals
Enables CCache on externals if available.
2021-07-06 12:54:24 -04:00
lat9nq
fbb26e6173 cmake, ci: Build bundled FFmpeg with yuzu
Drops usage of CMAKE_DEPENDENT_OPTION to allow using
YUZU_USE_BUNDLED_FFMPEG as an option on any platform. CI then now builds
FFmpeg always, netting about 10 MB less used on the AppImage.

Also somewhat fixes YUZU_USE_BUNDLED_QT so that it can be used even if
CMake doesn't clean up its state after running the first find_package.
2021-07-06 12:28:22 -04:00
619 changed files with 51963 additions and 28906 deletions

View File

@@ -18,7 +18,8 @@ cmake .. \
-DENABLE_COMPATIBILITY_LIST_DOWNLOAD=ON \
-DENABLE_QT_TRANSLATION=ON \
-DUSE_DISCORD_PRESENCE=ON \
-DYUZU_ENABLE_COMPATIBILITY_REPORTING=${ENABLE_COMPATIBILITY_REPORTING:-"OFF"}
-DYUZU_ENABLE_COMPATIBILITY_REPORTING=${ENABLE_COMPATIBILITY_REPORTING:-"OFF"} \
-DYUZU_USE_BUNDLED_FFMPEG=ON
make -j$(nproc)

View File

@@ -25,7 +25,7 @@ option(YUZU_USE_BUNDLED_BOOST "Download bundled Boost" OFF)
option(YUZU_USE_BUNDLED_LIBUSB "Compile bundled libusb" OFF)
CMAKE_DEPENDENT_OPTION(YUZU_USE_BUNDLED_FFMPEG "Download/Build bundled FFmpeg" ON "WIN32" OFF)
option(YUZU_USE_BUNDLED_FFMPEG "Download/Build bundled FFmpeg" "${WIN32}")
option(YUZU_USE_QT_WEB_ENGINE "Use QtWebEngine for web applet implementation" OFF)
@@ -496,7 +496,7 @@ endif()
# Ensure libusb is properly configured (based on dolphin libusb include)
if(NOT APPLE AND NOT YUZU_USE_BUNDLED_LIBUSB)
include(FindPkgConfig)
if (PKG_CONFIG_FOUND)
if (PKG_CONFIG_FOUND AND NOT CMAKE_SYSTEM_NAME MATCHES "DragonFly|FreeBSD")
pkg_check_modules(LIBUSB QUIET libusb-1.0>=1.0.24)
else()
find_package(LibUSB)
@@ -583,8 +583,32 @@ if (YUZU_USE_BUNDLED_FFMPEG)
"${FFmpeg_PREFIX};${FFmpeg_BUILD_DIR}"
CACHE PATH "Path to FFmpeg headers" FORCE)
if (${CMAKE_SYSTEM_NAME} STREQUAL "Linux")
Include(FindPkgConfig REQUIRED)
pkg_check_modules(LIBVA libva)
endif()
if(LIBVA_FOUND)
pkg_check_modules(LIBDRM libdrm REQUIRED)
find_package(X11 REQUIRED)
pkg_check_modules(LIBVA-DRM libva-drm REQUIRED)
pkg_check_modules(LIBVA-X11 libva-x11 REQUIRED)
set(FFmpeg_LIBVA_LIBRARIES
${LIBDRM_LIBRARIES}
${X11_LIBRARIES}
${LIBVA-DRM_LIBRARIES}
${LIBVA-X11_LIBRARIES}
${LIBVA_LIBRARIES})
set(FFmpeg_HWACCEL_FLAGS
--enable-hwaccel=h264_vaapi
--enable-hwaccel=vp9_vaapi
--enable-libdrm)
message(STATUS "VA-API found")
else()
set(FFmpeg_HWACCEL_FLAGS --disable-vaapi)
endif()
# `configure` parameters builds only exactly what yuzu needs from FFmpeg
# `--disable-{vaapi,vdpau}` is needed to avoid linking issues
# `--disable-vdpau` is needed to avoid linking issues
add_custom_command(
OUTPUT
${FFmpeg_MAKEFILE}
@@ -600,13 +624,16 @@ if (YUZU_USE_BUNDLED_FFMPEG)
--disable-network
--disable-postproc
--disable-swresample
--disable-vaapi
--disable-vdpau
--enable-decoder=h264
--enable-decoder=vp9
--cc="${CMAKE_C_COMPILER}"
--cxx="${CMAKE_CXX_COMPILER}"
${FFmpeg_HWACCEL_FLAGS}
WORKING_DIRECTORY
${FFmpeg_BUILD_DIR}
)
unset(FFmpeg_HWACCEL_FLAGS)
# Workaround for Ubuntu 18.04's older version of make not being able to call make as a child
# with context of the jobserver. Also helps ninja users.
@@ -616,9 +643,10 @@ if (YUZU_USE_BUNDLED_FFMPEG)
OUTPUT_VARIABLE
SYSTEM_THREADS)
set(FFmpeg_BUILD_LIBRARIES ${FFmpeg_LIBRARIES})
add_custom_command(
OUTPUT
${FFmpeg_LIBRARIES}
${FFmpeg_BUILD_LIBRARIES}
COMMAND
make -j${SYSTEM_THREADS}
WORKING_DIRECTORY
@@ -628,7 +656,12 @@ if (YUZU_USE_BUNDLED_FFMPEG)
# ALL makes this custom target build every time
# but it won't actually build if the DEPENDS parameter is up to date
add_custom_target(ffmpeg-configure ALL DEPENDS ${FFmpeg_MAKEFILE})
add_custom_target(ffmpeg-build ALL DEPENDS ${FFmpeg_LIBRARIES} ffmpeg-configure)
add_custom_target(ffmpeg-build ALL DEPENDS ${FFmpeg_BUILD_LIBRARIES} ffmpeg-configure)
link_libraries(${FFmpeg_LIBVA_LIBRARIES})
set(FFmpeg_LIBRARIES ${FFmpeg_LIBVA_LIBRARIES} ${FFmpeg_BUILD_LIBRARIES}
CACHE PATH "Paths to FFmpeg libraries" FORCE)
unset(FFmpeg_BUILD_LIBRARIES)
unset(FFmpeg_LIBVA_LIBRARIES)
if (FFmpeg_FOUND)
message(STATUS "Found FFmpeg version ${FFmpeg_VERSION}")

View File

@@ -48,69 +48,6 @@ if (BUILD_REPOSITORY)
endif()
endif()
# The variable SRC_DIR must be passed into the script (since it uses the current build directory for all values of CMAKE_*_DIR)
set(VIDEO_CORE "${SRC_DIR}/src/video_core")
set(HASH_FILES
"${VIDEO_CORE}/renderer_opengl/gl_arb_decompiler.cpp"
"${VIDEO_CORE}/renderer_opengl/gl_arb_decompiler.h"
"${VIDEO_CORE}/renderer_opengl/gl_shader_cache.cpp"
"${VIDEO_CORE}/renderer_opengl/gl_shader_cache.h"
"${VIDEO_CORE}/renderer_opengl/gl_shader_decompiler.cpp"
"${VIDEO_CORE}/renderer_opengl/gl_shader_decompiler.h"
"${VIDEO_CORE}/renderer_opengl/gl_shader_disk_cache.cpp"
"${VIDEO_CORE}/renderer_opengl/gl_shader_disk_cache.h"
"${VIDEO_CORE}/shader/decode/arithmetic.cpp"
"${VIDEO_CORE}/shader/decode/arithmetic_half.cpp"
"${VIDEO_CORE}/shader/decode/arithmetic_half_immediate.cpp"
"${VIDEO_CORE}/shader/decode/arithmetic_immediate.cpp"
"${VIDEO_CORE}/shader/decode/arithmetic_integer.cpp"
"${VIDEO_CORE}/shader/decode/arithmetic_integer_immediate.cpp"
"${VIDEO_CORE}/shader/decode/bfe.cpp"
"${VIDEO_CORE}/shader/decode/bfi.cpp"
"${VIDEO_CORE}/shader/decode/conversion.cpp"
"${VIDEO_CORE}/shader/decode/ffma.cpp"
"${VIDEO_CORE}/shader/decode/float_set.cpp"
"${VIDEO_CORE}/shader/decode/float_set_predicate.cpp"
"${VIDEO_CORE}/shader/decode/half_set.cpp"
"${VIDEO_CORE}/shader/decode/half_set_predicate.cpp"
"${VIDEO_CORE}/shader/decode/hfma2.cpp"
"${VIDEO_CORE}/shader/decode/image.cpp"
"${VIDEO_CORE}/shader/decode/integer_set.cpp"
"${VIDEO_CORE}/shader/decode/integer_set_predicate.cpp"
"${VIDEO_CORE}/shader/decode/memory.cpp"
"${VIDEO_CORE}/shader/decode/texture.cpp"
"${VIDEO_CORE}/shader/decode/other.cpp"
"${VIDEO_CORE}/shader/decode/predicate_set_predicate.cpp"
"${VIDEO_CORE}/shader/decode/predicate_set_register.cpp"
"${VIDEO_CORE}/shader/decode/register_set_predicate.cpp"
"${VIDEO_CORE}/shader/decode/shift.cpp"
"${VIDEO_CORE}/shader/decode/video.cpp"
"${VIDEO_CORE}/shader/decode/warp.cpp"
"${VIDEO_CORE}/shader/decode/xmad.cpp"
"${VIDEO_CORE}/shader/ast.cpp"
"${VIDEO_CORE}/shader/ast.h"
"${VIDEO_CORE}/shader/compiler_settings.cpp"
"${VIDEO_CORE}/shader/compiler_settings.h"
"${VIDEO_CORE}/shader/control_flow.cpp"
"${VIDEO_CORE}/shader/control_flow.h"
"${VIDEO_CORE}/shader/decode.cpp"
"${VIDEO_CORE}/shader/expr.cpp"
"${VIDEO_CORE}/shader/expr.h"
"${VIDEO_CORE}/shader/node.h"
"${VIDEO_CORE}/shader/node_helper.cpp"
"${VIDEO_CORE}/shader/node_helper.h"
"${VIDEO_CORE}/shader/registry.cpp"
"${VIDEO_CORE}/shader/registry.h"
"${VIDEO_CORE}/shader/shader_ir.cpp"
"${VIDEO_CORE}/shader/shader_ir.h"
"${VIDEO_CORE}/shader/track.cpp"
"${VIDEO_CORE}/shader/transform_feedback.cpp"
"${VIDEO_CORE}/shader/transform_feedback.h"
)
set(COMBINED "")
foreach (F IN LISTS HASH_FILES)
file(READ ${F} TMP)
set(COMBINED "${COMBINED}${TMP}")
endforeach()
string(MD5 SHADER_CACHE_VERSION "${COMBINED}")
# The variable SRC_DIR must be passed into the script
# (since it uses the current build directory for all values of CMAKE_*_DIR)
configure_file("${SRC_DIR}/src/common/scm_rev.cpp.in" "scm_rev.cpp" @ONLY)

View File

@@ -35,7 +35,7 @@ It is written in C++ with portability in mind, and we actively maintain builds f
The emulator is capable of running most commercial games at full speed, provided you meet the [necessary hardware requirements](https://yuzu-emu.org/help/quickstart/#hardware-requirements).
For a full list of games yuzu support, please visit our [Compatibility page](https://yuzu-emu.org/game/)
For a full list of games yuzu support, please visit our [Compatibility page](https://yuzu-emu.org/game/)
Check out our [website](https://yuzu-emu.org/) for the latest news on exciting features, monthly progress reports, and more!
@@ -43,7 +43,7 @@ Check out our [website](https://yuzu-emu.org/) for the latest news on exciting f
Most of the development happens on GitHub. It's also where [our central repository](https://github.com/yuzu-emu/yuzu) is hosted. For development discussion, please join us on [Discord](https://discord.com/invite/u77vRWY).
If you want to contribute, please take a look at the [Contributor's Guide](https://github.com/yuzu-emu/yuzu/wiki/Contributing) and [Developer Information](https://github.com/yuzu-emu/yuzu/wiki/Developer-Information).
If you want to contribute, please take a look at the [Contributor's Guide](https://github.com/yuzu-emu/yuzu/wiki/Contributing) and [Developer Information](https://github.com/yuzu-emu/yuzu/wiki/Developer-Information).
You can also contact any of the developers on Discord in order to know about the current state of the emulator.
If you want to contribute to the user interface translation project, please check out the [yuzu project on transifex](https://www.transifex.com/yuzu-emulator/yuzu). We centralize translation work there, and periodically upstream translations.
@@ -78,3 +78,5 @@ If you wish to support us a different way, please join our [Discord](https://dis
## License
yuzu is licensed under the GPLv2 (or any later version). Refer to the [license.txt](https://github.com/yuzu-emu/yuzu/blob/master/license.txt) file.
The [Skyline-Emulator Team](https://github.com/skyline-emu/skyline) is exempt from GPLv2 for the contributions from all these contributors [FernandoS27](https://github.com/FernandoS27), [lioncash](https://github.com/lioncash), [bunnei](https://github.com/bunnei), [ReinUsesLisp](https://github.com/ReinUsesLisp), [Morph1984](https://github.com/Morph1984), [ogniK5377](https://github.com/ogniK5377), [german77](https://github.com/german77), [ameerj](https://github.com/ameerj), [Kelebek1](https://github.com/Kelebek1) and [lat9nq](https://github.com/lat9nq). They may only use the code from these contributors under Mozilla Public License, version 2.0.

View File

@@ -38,6 +38,26 @@ QPushButton#RendererStatusBarButton:!checked {
color: #0066ff;
}
QPushButton#GPUStatusBarButton {
color: #656565;
border: 1px solid transparent;
background-color: transparent;
padding: 0px 3px 0px 3px;
text-align: center;
}
QPushButton#GPUStatusBarButton:hover {
border: 1px solid #76797C;
}
QPushButton#GPUStatusBarButton:checked {
color: #b06020;
}
QPushButton#GPUStatusBarButton:!checked {
color: #109010;
}
QPushButton#buttonRefreshDevices {
min-width: 21px;
min-height: 21px;

View File

@@ -1283,6 +1283,27 @@ QPushButton#RendererStatusBarButton:!checked {
color: #00ccdd;
}
QPushButton#GPUStatusBarButton {
min-width: 0px;
color: #656565;
border: 1px solid transparent;
background-color: transparent;
padding: 0px 3px 0px 3px;
text-align: center;
}
QPushButton#GPUStatusBarButton:hover {
border: 1px solid #76797C;
}
QPushButton#GPUStatusBarButton:checked {
color: #ff8040;
}
QPushButton#GPUStatusBarButton:!checked {
color: #40dd40;
}
QPushButton#buttonRefreshDevices {
min-width: 23px;
min-height: 23px;

View File

@@ -2186,6 +2186,27 @@ QPushButton#RendererStatusBarButton:!checked {
color: #00ccdd;
}
QPushButton#GPUStatusBarButton {
min-width: 0px;
color: #656565;
border: 1px solid transparent;
background-color: transparent;
padding: 0px 3px 0px 3px;
text-align: center;
}
QPushButton#GPUStatusBarButton:hover {
border: 1px solid #76797C;
}
QPushButton#GPUStatusBarButton:checked {
color: #ff8040;
}
QPushButton#GPUStatusBarButton:!checked {
color: #40dd40;
}
QPushButton#buttonRefreshDevices {
min-width: 19px;
min-height: 19px;

View File

@@ -67,6 +67,8 @@ if (MINGW OR (${CMAKE_SYSTEM_NAME} MATCHES "Linux") OR APPLE)
"${LIBUSB_MAKEFILE}"
COMMAND
env
CC="${CMAKE_C_COMPILER}"
CXX="${CMAKE_CXX_COMPILER}"
CFLAGS="${LIBUSB_CFLAGS}"
sh "${LIBUSB_CONFIGURE}"
${LIBUSB_CONFIGURE_ARGS}

View File

@@ -142,6 +142,7 @@ add_subdirectory(core)
add_subdirectory(audio_core)
add_subdirectory(video_core)
add_subdirectory(input_common)
add_subdirectory(shader_recompiler)
add_subdirectory(tests)
if (ENABLE_SDL2)

View File

@@ -1,8 +1,3 @@
# Add a custom command to generate a new shader_cache_version hash when any of the following files change
# NOTE: This is an approximation of what files affect shader generation, its possible something else
# could affect the result, but much more unlikely than the following files. Keeping a list of files
# like this allows for much better caching since it doesn't force the user to recompile binary shaders every update
set(VIDEO_CORE "${CMAKE_SOURCE_DIR}/src/video_core")
if (DEFINED ENV{AZURECIREPO})
set(BUILD_REPOSITORY $ENV{AZURECIREPO})
endif()
@@ -30,64 +25,7 @@ add_custom_command(OUTPUT scm_rev.cpp
-DGIT_EXECUTABLE=${GIT_EXECUTABLE}
-P ${CMAKE_SOURCE_DIR}/CMakeModules/GenerateSCMRev.cmake
DEPENDS
# WARNING! It was too much work to try and make a common location for this list,
# so if you need to change it, please update CMakeModules/GenerateSCMRev.cmake as well
"${VIDEO_CORE}/renderer_opengl/gl_arb_decompiler.cpp"
"${VIDEO_CORE}/renderer_opengl/gl_arb_decompiler.h"
"${VIDEO_CORE}/renderer_opengl/gl_shader_cache.cpp"
"${VIDEO_CORE}/renderer_opengl/gl_shader_cache.h"
"${VIDEO_CORE}/renderer_opengl/gl_shader_decompiler.cpp"
"${VIDEO_CORE}/renderer_opengl/gl_shader_decompiler.h"
"${VIDEO_CORE}/renderer_opengl/gl_shader_disk_cache.cpp"
"${VIDEO_CORE}/renderer_opengl/gl_shader_disk_cache.h"
"${VIDEO_CORE}/shader/decode/arithmetic.cpp"
"${VIDEO_CORE}/shader/decode/arithmetic_half.cpp"
"${VIDEO_CORE}/shader/decode/arithmetic_half_immediate.cpp"
"${VIDEO_CORE}/shader/decode/arithmetic_immediate.cpp"
"${VIDEO_CORE}/shader/decode/arithmetic_integer.cpp"
"${VIDEO_CORE}/shader/decode/arithmetic_integer_immediate.cpp"
"${VIDEO_CORE}/shader/decode/bfe.cpp"
"${VIDEO_CORE}/shader/decode/bfi.cpp"
"${VIDEO_CORE}/shader/decode/conversion.cpp"
"${VIDEO_CORE}/shader/decode/ffma.cpp"
"${VIDEO_CORE}/shader/decode/float_set.cpp"
"${VIDEO_CORE}/shader/decode/float_set_predicate.cpp"
"${VIDEO_CORE}/shader/decode/half_set.cpp"
"${VIDEO_CORE}/shader/decode/half_set_predicate.cpp"
"${VIDEO_CORE}/shader/decode/hfma2.cpp"
"${VIDEO_CORE}/shader/decode/image.cpp"
"${VIDEO_CORE}/shader/decode/integer_set.cpp"
"${VIDEO_CORE}/shader/decode/integer_set_predicate.cpp"
"${VIDEO_CORE}/shader/decode/memory.cpp"
"${VIDEO_CORE}/shader/decode/texture.cpp"
"${VIDEO_CORE}/shader/decode/other.cpp"
"${VIDEO_CORE}/shader/decode/predicate_set_predicate.cpp"
"${VIDEO_CORE}/shader/decode/predicate_set_register.cpp"
"${VIDEO_CORE}/shader/decode/register_set_predicate.cpp"
"${VIDEO_CORE}/shader/decode/shift.cpp"
"${VIDEO_CORE}/shader/decode/video.cpp"
"${VIDEO_CORE}/shader/decode/warp.cpp"
"${VIDEO_CORE}/shader/decode/xmad.cpp"
"${VIDEO_CORE}/shader/ast.cpp"
"${VIDEO_CORE}/shader/ast.h"
"${VIDEO_CORE}/shader/compiler_settings.cpp"
"${VIDEO_CORE}/shader/compiler_settings.h"
"${VIDEO_CORE}/shader/control_flow.cpp"
"${VIDEO_CORE}/shader/control_flow.h"
"${VIDEO_CORE}/shader/decode.cpp"
"${VIDEO_CORE}/shader/expr.cpp"
"${VIDEO_CORE}/shader/expr.h"
"${VIDEO_CORE}/shader/node.h"
"${VIDEO_CORE}/shader/node_helper.cpp"
"${VIDEO_CORE}/shader/node_helper.h"
"${VIDEO_CORE}/shader/registry.cpp"
"${VIDEO_CORE}/shader/registry.h"
"${VIDEO_CORE}/shader/shader_ir.cpp"
"${VIDEO_CORE}/shader/shader_ir.h"
"${VIDEO_CORE}/shader/track.cpp"
"${VIDEO_CORE}/shader/transform_feedback.cpp"
"${VIDEO_CORE}/shader/transform_feedback.h"
# and also check that the scm_rev files haven't changed
# Check that the scm_rev files haven't changed
"${CMAKE_CURRENT_SOURCE_DIR}/scm_rev.cpp.in"
"${CMAKE_CURRENT_SOURCE_DIR}/scm_rev.h"
# technically we should regenerate if the git version changed, but its not worth the effort imo
@@ -231,7 +169,7 @@ endif()
create_target_directory_groups(common)
target_link_libraries(common PUBLIC ${Boost_LIBRARIES} fmt::fmt microprofile)
target_link_libraries(common PUBLIC ${Boost_LIBRARIES} fmt::fmt microprofile Threads::Threads)
target_link_libraries(common PRIVATE lz4::lz4 xbyak)
if (MSVC)
target_link_libraries(common PRIVATE zstd::zstd)

View File

@@ -52,8 +52,12 @@ assert_noinline_call(const Fn& fn) {
#define DEBUG_ASSERT(_a_) ASSERT(_a_)
#define DEBUG_ASSERT_MSG(_a_, ...) ASSERT_MSG(_a_, __VA_ARGS__)
#else // not debug
#define DEBUG_ASSERT(_a_)
#define DEBUG_ASSERT_MSG(_a_, _desc_, ...)
#define DEBUG_ASSERT(_a_) \
do { \
} while (0)
#define DEBUG_ASSERT_MSG(_a_, _desc_, ...) \
do { \
} while (0)
#endif
#define UNIMPLEMENTED() ASSERT_MSG(false, "Unimplemented code!")

View File

@@ -20,6 +20,10 @@ std::string ToUTF8String(std::u8string_view u8_string) {
return std::string{u8_string.begin(), u8_string.end()};
}
std::string BufferToUTF8String(std::span<const u8> buffer) {
return std::string{buffer.begin(), std::ranges::find(buffer, u8{0})};
}
std::string PathToUTF8String(const std::filesystem::path& path) {
return ToUTF8String(path.u8string());
}

View File

@@ -46,6 +46,17 @@ concept IsChar = std::same_as<T, char>;
*/
[[nodiscard]] std::string ToUTF8String(std::u8string_view u8_string);
/**
* Converts a buffer of bytes to a UTF8-encoded std::string.
* This converts from the start of the buffer until the first encountered null-terminator.
* If no null-terminator is found, this converts the entire buffer instead.
*
* @param buffer Buffer of bytes
*
* @returns UTF-8 encoded std::string.
*/
[[nodiscard]] std::string BufferToUTF8String(std::span<const u8> buffer);
/**
* Converts a filesystem path to a UTF-8 encoded std::string.
*

View File

@@ -61,7 +61,7 @@ template <typename ContiguousContainer>
return out;
}
[[nodiscard]] constexpr std::array<u8, 16> AsArray(const char (&data)[17]) {
[[nodiscard]] constexpr std::array<u8, 16> AsArray(const char (&data)[33]) {
return HexStringToArray<16>(data);
}

View File

@@ -6,7 +6,7 @@
#include <windows.h>
#include "common/dynamic_library.h"
#elif defined(__linux__) // ^^^ Windows ^^^ vvv Linux vvv
#elif defined(__linux__) || defined(__FreeBSD__) // ^^^ Windows ^^^ vvv Linux vvv
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
@@ -343,7 +343,7 @@ private:
std::unordered_map<size_t, size_t> placeholder_host_pointers; ///< Placeholder backing offset
};
#elif defined(__linux__) // ^^^ Windows ^^^ vvv Linux vvv
#elif defined(__linux__) || defined(__FreeBSD__) // ^^^ Windows ^^^ vvv Linux vvv
class HostMemory::Impl {
public:
@@ -357,7 +357,12 @@ public:
});
// Backing memory initialization
#if defined(__FreeBSD__) && __FreeBSD__ < 13
// XXX Drop after FreeBSD 12.* reaches EOL on 2024-06-30
fd = shm_open(SHM_ANON, O_RDWR, 0600);
#else
fd = memfd_create("HostMemory", 0);
#endif
if (fd == -1) {
LOG_CRITICAL(HW_Memory, "memfd_create failed: {}", strerror(errno));
throw std::bad_alloc{};

View File

@@ -144,6 +144,10 @@ bool ParseFilterRule(Filter& instance, Iterator begin, Iterator end) {
SUB(Render, Software) \
SUB(Render, OpenGL) \
SUB(Render, Vulkan) \
CLS(Shader) \
SUB(Shader, SPIRV) \
SUB(Shader, GLASM) \
SUB(Shader, GLSL) \
CLS(Audio) \
SUB(Audio, DSP) \
SUB(Audio, Sink) \

View File

@@ -114,6 +114,10 @@ enum class Class : u8 {
Render_Software, ///< Software renderer backend
Render_OpenGL, ///< OpenGL backend
Render_Vulkan, ///< Vulkan backend
Shader, ///< Shader recompiler
Shader_SPIRV, ///< Shader SPIR-V code generation
Shader_GLASM, ///< Shader GLASM code generation
Shader_GLSL, ///< Shader GLSL code generation
Audio, ///< Audio emulation
Audio_DSP, ///< The HLE implementation of the DSP
Audio_Sink, ///< Emulator audio output backend

View File

@@ -14,7 +14,6 @@
#define BUILD_ID "@BUILD_ID@"
#define TITLE_BAR_FORMAT_IDLE "@TITLE_BAR_FORMAT_IDLE@"
#define TITLE_BAR_FORMAT_RUNNING "@TITLE_BAR_FORMAT_RUNNING@"
#define SHADER_CACHE_VERSION "@SHADER_CACHE_VERSION@"
namespace Common {
@@ -28,7 +27,6 @@ const char g_build_version[] = BUILD_VERSION;
const char g_build_id[] = BUILD_ID;
const char g_title_bar_format_idle[] = TITLE_BAR_FORMAT_IDLE;
const char g_title_bar_format_running[] = TITLE_BAR_FORMAT_RUNNING;
const char g_shader_cache_version[] = SHADER_CACHE_VERSION;
} // namespace

View File

@@ -48,8 +48,8 @@ void LogSettings() {
log_setting("Core_UseMultiCore", values.use_multi_core.GetValue());
log_setting("CPU_Accuracy", values.cpu_accuracy.GetValue());
log_setting("Renderer_UseResolutionFactor", values.resolution_factor.GetValue());
log_setting("Renderer_UseFrameLimit", values.use_frame_limit.GetValue());
log_setting("Renderer_FrameLimit", values.frame_limit.GetValue());
log_setting("Renderer_UseSpeedLimit", values.use_speed_limit.GetValue());
log_setting("Renderer_SpeedLimit", values.speed_limit.GetValue());
log_setting("Renderer_UseDiskShaderCache", values.use_disk_shader_cache.GetValue());
log_setting("Renderer_GPUAccuracyLevel", values.gpu_accuracy.GetValue());
log_setting("Renderer_UseAsynchronousGpuEmulation",
@@ -57,7 +57,7 @@ void LogSettings() {
log_setting("Renderer_UseNvdecEmulation", values.use_nvdec_emulation.GetValue());
log_setting("Renderer_AccelerateASTC", values.accelerate_astc.GetValue());
log_setting("Renderer_UseVsync", values.use_vsync.GetValue());
log_setting("Renderer_UseAssemblyShaders", values.use_assembly_shaders.GetValue());
log_setting("Renderer_ShaderBackend", values.shader_backend.GetValue());
log_setting("Renderer_UseAsynchronousShaders", values.use_asynchronous_shaders.GetValue());
log_setting("Renderer_UseGarbageCollection", values.use_caches_gc.GetValue());
log_setting("Renderer_AnisotropicFilteringLevel", values.max_anisotropy.GetValue());
@@ -132,15 +132,15 @@ void RestoreGlobalState(bool is_powered_on) {
values.vulkan_device.SetGlobal(true);
values.aspect_ratio.SetGlobal(true);
values.max_anisotropy.SetGlobal(true);
values.use_frame_limit.SetGlobal(true);
values.frame_limit.SetGlobal(true);
values.use_speed_limit.SetGlobal(true);
values.speed_limit.SetGlobal(true);
values.use_disk_shader_cache.SetGlobal(true);
values.gpu_accuracy.SetGlobal(true);
values.use_asynchronous_gpu_emulation.SetGlobal(true);
values.use_nvdec_emulation.SetGlobal(true);
values.accelerate_astc.SetGlobal(true);
values.use_vsync.SetGlobal(true);
values.use_assembly_shaders.SetGlobal(true);
values.shader_backend.SetGlobal(true);
values.use_asynchronous_shaders.SetGlobal(true);
values.use_fast_gpu_time.SetGlobal(true);
values.use_caches_gc.SetGlobal(true);

View File

@@ -24,6 +24,12 @@ enum class RendererBackend : u32 {
Vulkan = 1,
};
enum class ShaderBackend : u32 {
GLSL = 0,
GLASM = 1,
SPIRV = 2,
};
enum class GPUAccuracy : u32 {
Normal = 0,
High = 1,
@@ -36,6 +42,11 @@ enum class CPUAccuracy : u32 {
Unsafe = 2,
};
enum class FullscreenMode : u32 {
Borderless = 0,
Exclusive = 1,
};
/** The BasicSetting class is a simple resource manager. It defines a label and default value
* alongside the actual value of the setting for simpler and less-error prone use with frontend
* configurations. Setting a default value and label is required, though subclasses may deviate from
@@ -308,30 +319,35 @@ struct Values {
// Renderer
Setting<RendererBackend> renderer_backend{RendererBackend::OpenGL, "backend"};
BasicSetting<bool> renderer_debug{false, "debug"};
BasicSetting<bool> renderer_shader_feedback{false, "shader_feedback"};
BasicSetting<bool> enable_nsight_aftermath{false, "nsight_aftermath"};
BasicSetting<bool> disable_shader_loop_safety_checks{false,
"disable_shader_loop_safety_checks"};
Setting<int> vulkan_device{0, "vulkan_device"};
Setting<u16> resolution_factor{1, "resolution_factor"};
// *nix platforms may have issues with the borderless windowed fullscreen mode.
// Default to exclusive fullscreen on these platforms for now.
Setting<int> fullscreen_mode{
Setting<FullscreenMode> fullscreen_mode{
#ifdef _WIN32
0,
FullscreenMode::Borderless,
#else
1,
FullscreenMode::Exclusive,
#endif
"fullscreen_mode"};
Setting<int> aspect_ratio{0, "aspect_ratio"};
Setting<int> max_anisotropy{0, "max_anisotropy"};
Setting<bool> use_frame_limit{true, "use_frame_limit"};
Setting<u16> frame_limit{100, "frame_limit"};
Setting<bool> use_speed_limit{true, "use_speed_limit"};
Setting<u16> speed_limit{100, "speed_limit"};
Setting<bool> use_disk_shader_cache{true, "use_disk_shader_cache"};
Setting<GPUAccuracy> gpu_accuracy{GPUAccuracy::High, "gpu_accuracy"};
Setting<bool> use_asynchronous_gpu_emulation{true, "use_asynchronous_gpu_emulation"};
Setting<bool> use_nvdec_emulation{true, "use_nvdec_emulation"};
Setting<bool> accelerate_astc{true, "accelerate_astc"};
Setting<bool> use_vsync{true, "use_vsync"};
BasicSetting<u16> fps_cap{1000, "fps_cap"};
BasicSetting<bool> disable_fps_limit{false, "disable_fps_limit"};
Setting<bool> use_assembly_shaders{false, "use_assembly_shaders"};
Setting<ShaderBackend> shader_backend{ShaderBackend::GLASM, "shader_backend"};
Setting<bool> use_asynchronous_shaders{false, "use_asynchronous_shaders"};
Setting<bool> use_fast_gpu_time{true, "use_fast_gpu_time"};
Setting<bool> use_caches_gc{false, "use_caches_gc"};

View File

@@ -5,6 +5,7 @@
#pragma once
#include <atomic>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <stop_token>
@@ -39,7 +40,7 @@ public:
const auto lambda = [this, func](std::stop_token stop_token) {
Common::SetCurrentThreadName(thread_name.c_str());
{
std::conditional_t<with_state, StateType, int> state{func()};
[[maybe_unused]] std::conditional_t<with_state, StateType, int> state{func()};
while (!stop_token.stop_requested()) {
Task task;
{

View File

@@ -6,10 +6,64 @@
#include <fmt/format.h>
#include "common/assert.h"
#include "common/uuid.h"
namespace Common {
namespace {
bool IsHexDigit(char c) {
return (c >= '0' && c <= '9') || (c >= 'a' && c <= 'f') || (c >= 'A' && c <= 'F');
}
u8 HexCharToByte(char c) {
if (c >= '0' && c <= '9') {
return static_cast<u8>(c - '0');
}
if (c >= 'a' && c <= 'f') {
return static_cast<u8>(c - 'a' + 10);
}
if (c >= 'A' && c <= 'F') {
return static_cast<u8>(c - 'A' + 10);
}
ASSERT_MSG(false, "{} is not a hexadecimal digit!", c);
return u8{0};
}
} // Anonymous namespace
u128 HexStringToU128(std::string_view hex_string) {
const size_t length = hex_string.length();
// Detect "0x" prefix.
const bool has_0x_prefix = length > 2 && hex_string[0] == '0' && hex_string[1] == 'x';
const size_t offset = has_0x_prefix ? 2 : 0;
// Check length.
if (length > 32 + offset) {
ASSERT_MSG(false, "hex_string has more than 32 hexadecimal characters!");
return INVALID_UUID;
}
u64 lo = 0;
u64 hi = 0;
for (size_t i = 0; i < length - offset; ++i) {
const char c = hex_string[length - 1 - i];
if (!IsHexDigit(c)) {
ASSERT_MSG(false, "{} is not a hexadecimal digit!", c);
return INVALID_UUID;
}
if (i < 16) {
lo |= u64{HexCharToByte(c)} << (i * 4);
}
if (i >= 16) {
hi |= u64{HexCharToByte(c)} << ((i - 16) * 4);
}
}
return u128{lo, hi};
}
UUID UUID::Generate() {
std::random_device device;
std::mt19937 gen(device());
@@ -18,7 +72,7 @@ UUID UUID::Generate() {
}
std::string UUID::Format() const {
return fmt::format("0x{:016X}{:016X}", uuid[1], uuid[0]);
return fmt::format("{:016x}{:016x}", uuid[1], uuid[0]);
}
std::string UUID::FormatSwitch() const {

View File

@@ -5,6 +5,7 @@
#pragma once
#include <string>
#include <string_view>
#include "common/common_types.h"
@@ -12,12 +13,30 @@ namespace Common {
constexpr u128 INVALID_UUID{{0, 0}};
/**
* Converts a hex string to a 128-bit unsigned integer.
*
* The hex string can be formatted in lowercase or uppercase, with or without the "0x" prefix.
*
* This function will assert and return INVALID_UUID under the following conditions:
* - If the hex string is more than 32 characters long
* - If the hex string contains non-hexadecimal characters
*
* @param hex_string Hexadecimal string
*
* @returns A 128-bit unsigned integer if successfully converted, INVALID_UUID otherwise.
*/
[[nodiscard]] u128 HexStringToU128(std::string_view hex_string);
struct UUID {
// UUIDs which are 0 are considered invalid!
u128 uuid;
UUID() = default;
constexpr explicit UUID(const u128& id) : uuid{id} {}
constexpr explicit UUID(const u64 lo, const u64 hi) : uuid{{lo, hi}} {}
explicit UUID(std::string_view hex_string) {
uuid = HexStringToU128(hex_string);
}
[[nodiscard]] constexpr explicit operator bool() const {
return uuid != INVALID_UUID;
@@ -50,3 +69,14 @@ struct UUID {
static_assert(sizeof(UUID) == 16, "UUID is an invalid size!");
} // namespace Common
namespace std {
template <>
struct hash<Common::UUID> {
size_t operator()(const Common::UUID& uuid) const noexcept {
return uuid.uuid[1] ^ uuid.uuid[0];
}
};
} // namespace std

View File

@@ -517,6 +517,8 @@ add_library(core STATIC
hle/service/psc/psc.h
hle/service/ptm/psm.cpp
hle/service/ptm/psm.h
hle/service/kernel_helpers.cpp
hle/service/kernel_helpers.h
hle/service/service.cpp
hle/service/service.h
hle/service/set/set.cpp

View File

@@ -411,7 +411,7 @@ struct System::Impl {
std::string status_details = "";
std::unique_ptr<Core::PerfStats> perf_stats;
Core::FrameLimiter frame_limiter;
Core::SpeedLimiter speed_limiter;
bool is_multicore{};
bool is_async_gpu{};
@@ -606,12 +606,12 @@ const Core::PerfStats& System::GetPerfStats() const {
return *impl->perf_stats;
}
Core::FrameLimiter& System::FrameLimiter() {
return impl->frame_limiter;
Core::SpeedLimiter& System::SpeedLimiter() {
return impl->speed_limiter;
}
const Core::FrameLimiter& System::FrameLimiter() const {
return impl->frame_limiter;
const Core::SpeedLimiter& System::SpeedLimiter() const {
return impl->speed_limiter;
}
Loader::ResultStatus System::GetGameName(std::string& out) const {

View File

@@ -94,7 +94,7 @@ class ARM_Interface;
class CpuManager;
class DeviceMemory;
class ExclusiveMonitor;
class FrameLimiter;
class SpeedLimiter;
class PerfStats;
class Reporter;
class TelemetrySession;
@@ -292,11 +292,11 @@ public:
/// Provides a constant reference to the internal PerfStats instance.
[[nodiscard]] const Core::PerfStats& GetPerfStats() const;
/// Provides a reference to the frame limiter;
[[nodiscard]] Core::FrameLimiter& FrameLimiter();
/// Provides a reference to the speed limiter;
[[nodiscard]] Core::SpeedLimiter& SpeedLimiter();
/// Provides a constant referent to the frame limiter
[[nodiscard]] const Core::FrameLimiter& FrameLimiter() const;
/// Provides a constant reference to the speed limiter
[[nodiscard]] const Core::SpeedLimiter& SpeedLimiter() const;
/// Gets the name of the current game
[[nodiscard]] Loader::ResultStatus GetGameName(std::string& out) const;

View File

@@ -12,9 +12,9 @@ namespace HLE::ApiVersion {
// Horizon OS version constants.
constexpr u8 HOS_VERSION_MAJOR = 11;
constexpr u8 HOS_VERSION_MINOR = 0;
constexpr u8 HOS_VERSION_MICRO = 1;
constexpr u8 HOS_VERSION_MAJOR = 12;
constexpr u8 HOS_VERSION_MINOR = 1;
constexpr u8 HOS_VERSION_MICRO = 0;
// NintendoSDK version constants.
@@ -22,15 +22,15 @@ constexpr u8 SDK_REVISION_MAJOR = 1;
constexpr u8 SDK_REVISION_MINOR = 0;
constexpr char PLATFORM_STRING[] = "NX";
constexpr char VERSION_HASH[] = "69103fcb2004dace877094c2f8c29e6113be5dbf";
constexpr char DISPLAY_VERSION[] = "11.0.1";
constexpr char DISPLAY_TITLE[] = "NintendoSDK Firmware for NX 11.0.1-1.0";
constexpr char VERSION_HASH[] = "76b10c2dab7d3aa73fc162f8dff1655e6a21caf4";
constexpr char DISPLAY_VERSION[] = "12.1.0";
constexpr char DISPLAY_TITLE[] = "NintendoSDK Firmware for NX 12.1.0-1.0";
// Atmosphere version constants.
constexpr u8 ATMOSPHERE_RELEASE_VERSION_MAJOR = 0;
constexpr u8 ATMOSPHERE_RELEASE_VERSION_MINOR = 19;
constexpr u8 ATMOSPHERE_RELEASE_VERSION_MICRO = 4;
constexpr u8 ATMOSPHERE_RELEASE_VERSION_MICRO = 5;
constexpr u32 GetTargetFirmware() {
return u32{HOS_VERSION_MAJOR} << 24 | u32{HOS_VERSION_MINOR} << 16 |

View File

@@ -58,6 +58,9 @@ bool SessionRequestManager::HasSessionRequestHandler(const HLERequestContext& co
void SessionRequestHandler::ClientConnected(KServerSession* session) {
session->ClientConnected(shared_from_this());
// Ensure our server session is tracked globally.
kernel.RegisterServerSession(session);
}
void SessionRequestHandler::ClientDisconnected(KServerSession* session) {

View File

@@ -3,6 +3,7 @@
// Refer to the license.txt file included.
#include "core/hle/kernel/k_auto_object.h"
#include "core/hle/kernel/kernel.h"
namespace Kernel {
@@ -11,4 +12,12 @@ KAutoObject* KAutoObject::Create(KAutoObject* obj) {
return obj;
}
void KAutoObject::RegisterWithKernel() {
kernel.RegisterKernelObject(this);
}
void KAutoObject::UnregisterWithKernel() {
kernel.UnregisterKernelObject(this);
}
} // namespace Kernel

View File

@@ -85,8 +85,12 @@ private:
KERNEL_AUTOOBJECT_TRAITS(KAutoObject, KAutoObject);
public:
explicit KAutoObject(KernelCore& kernel_) : kernel(kernel_) {}
virtual ~KAutoObject() = default;
explicit KAutoObject(KernelCore& kernel_) : kernel(kernel_) {
RegisterWithKernel();
}
virtual ~KAutoObject() {
UnregisterWithKernel();
}
static KAutoObject* Create(KAutoObject* ptr);
@@ -166,6 +170,10 @@ public:
}
}
private:
void RegisterWithKernel();
void UnregisterWithKernel();
protected:
KernelCore& kernel;
std::string name;

View File

@@ -10,6 +10,7 @@
#include "common/alignment.h"
#include "common/assert.h"
#include "common/logging/log.h"
#include "common/scope_exit.h"
#include "common/settings.h"
#include "core/core.h"
#include "core/device_memory.h"
@@ -43,6 +44,8 @@ void SetupMainThread(Core::System& system, KProcess& owner_process, u32 priority
ASSERT(owner_process.GetResourceLimit()->Reserve(LimitableResource::Threads, 1));
KThread* thread = KThread::Create(system.Kernel());
SCOPE_EXIT({ thread->Close(); });
ASSERT(KThread::InitializeUserThread(system, thread, entry_point, 0, stack_top, priority,
owner_process.GetIdealCoreId(), &owner_process)
.IsSuccess());
@@ -162,7 +165,7 @@ void KProcess::DecrementThreadCount() {
ASSERT(num_threads > 0);
if (const auto count = --num_threads; count == 0) {
UNIMPLEMENTED_MSG("Process termination is not implemented!");
LOG_WARNING(Kernel, "Process termination is not fully implemented.");
}
}
@@ -406,6 +409,9 @@ void KProcess::Finalize() {
resource_limit->Close();
}
// Finalize the handle table and close any open handles.
handle_table.Finalize();
// Perform inherited finalization.
KAutoObjectWithSlabHeapAndContainer<KProcess, KSynchronizationObject>::Finalize();
}

View File

@@ -28,7 +28,10 @@ namespace Kernel {
KServerSession::KServerSession(KernelCore& kernel_) : KSynchronizationObject{kernel_} {}
KServerSession::~KServerSession() {}
KServerSession::~KServerSession() {
// Ensure that the global list tracking server sessions does not hold on to a reference.
kernel.UnregisterServerSession(this);
}
void KServerSession::Initialize(KSession* parent_session_, std::string&& name_,
std::shared_ptr<SessionRequestManager> manager_) {

View File

@@ -61,6 +61,7 @@ struct KernelCore::Impl {
void Initialize(KernelCore& kernel) {
global_scheduler_context = std::make_unique<Kernel::GlobalSchedulerContext>(kernel);
global_handle_table = std::make_unique<Kernel::KHandleTable>(kernel);
global_handle_table->Initialize(KHandleTable::MaxTableSize);
is_phantom_mode_for_singlecore = false;
@@ -90,9 +91,39 @@ struct KernelCore::Impl {
}
void Shutdown() {
// Shutdown all processes.
if (current_process) {
current_process->Finalize();
current_process->Close();
current_process = nullptr;
}
process_list.clear();
// Ensures all service threads gracefully shutdown
// Close all open server ports.
std::unordered_set<KServerPort*> server_ports_;
{
std::lock_guard lk(server_ports_lock);
server_ports_ = server_ports;
server_ports.clear();
}
for (auto* server_port : server_ports_) {
server_port->Close();
}
// Close all open server sessions.
std::unordered_set<KServerSession*> server_sessions_;
{
std::lock_guard lk(server_sessions_lock);
server_sessions_ = server_sessions;
server_sessions.clear();
}
for (auto* server_session : server_sessions_) {
server_session->Close();
}
// Ensure that the object list container is finalized and properly shutdown.
object_list_container.Finalize();
// Ensures all service threads gracefully shutdown.
service_threads.clear();
next_object_id = 0;
@@ -111,11 +142,7 @@ struct KernelCore::Impl {
cores.clear();
if (current_process) {
current_process->Close();
current_process = nullptr;
}
global_handle_table->Finalize();
global_handle_table.reset();
preemption_event = nullptr;
@@ -142,6 +169,16 @@ struct KernelCore::Impl {
// Next host thead ID to use, 0-3 IDs represent core threads, >3 represent others
next_host_thread_id = Core::Hardware::NUM_CPU_CORES;
// Track kernel objects that were not freed on shutdown
{
std::lock_guard lk(registered_objects_lock);
if (registered_objects.size()) {
LOG_WARNING(Kernel, "{} kernel objects were dangling on shutdown!",
registered_objects.size());
registered_objects.clear();
}
}
}
void InitializePhysicalCores() {
@@ -630,6 +667,21 @@ struct KernelCore::Impl {
user_slab_heap_size);
}
KClientPort* CreateNamedServicePort(std::string name) {
auto search = service_interface_factory.find(name);
if (search == service_interface_factory.end()) {
UNIMPLEMENTED();
return {};
}
KClientPort* port = &search->second(system.ServiceManager(), system);
{
std::lock_guard lk(server_ports_lock);
server_ports.insert(&port->GetParent()->GetServerPort());
}
return port;
}
std::atomic<u32> next_object_id{0};
std::atomic<u64> next_kernel_process_id{KProcess::InitialKIPIDMin};
std::atomic<u64> next_user_process_id{KProcess::ProcessIDMin};
@@ -656,6 +708,12 @@ struct KernelCore::Impl {
/// the ConnectToPort SVC.
std::unordered_map<std::string, ServiceInterfaceFactory> service_interface_factory;
NamedPortTable named_ports;
std::unordered_set<KServerPort*> server_ports;
std::unordered_set<KServerSession*> server_sessions;
std::unordered_set<KAutoObject*> registered_objects;
std::mutex server_ports_lock;
std::mutex server_sessions_lock;
std::mutex registered_objects_lock;
std::unique_ptr<Core::ExclusiveMonitor> exclusive_monitor;
std::vector<Kernel::PhysicalCore> cores;
@@ -844,12 +902,27 @@ void KernelCore::RegisterNamedService(std::string name, ServiceInterfaceFactory&
}
KClientPort* KernelCore::CreateNamedServicePort(std::string name) {
auto search = impl->service_interface_factory.find(name);
if (search == impl->service_interface_factory.end()) {
UNIMPLEMENTED();
return {};
}
return &search->second(impl->system.ServiceManager(), impl->system);
return impl->CreateNamedServicePort(std::move(name));
}
void KernelCore::RegisterServerSession(KServerSession* server_session) {
std::lock_guard lk(impl->server_sessions_lock);
impl->server_sessions.insert(server_session);
}
void KernelCore::UnregisterServerSession(KServerSession* server_session) {
std::lock_guard lk(impl->server_sessions_lock);
impl->server_sessions.erase(server_session);
}
void KernelCore::RegisterKernelObject(KAutoObject* object) {
std::lock_guard lk(impl->registered_objects_lock);
impl->registered_objects.insert(object);
}
void KernelCore::UnregisterKernelObject(KAutoObject* object) {
std::lock_guard lk(impl->registered_objects_lock);
impl->registered_objects.erase(object);
}
bool KernelCore::IsValidNamedPort(NamedPortTable::const_iterator port) const {

View File

@@ -45,6 +45,7 @@ class KPort;
class KProcess;
class KResourceLimit;
class KScheduler;
class KServerSession;
class KSession;
class KSharedMemory;
class KThread;
@@ -185,6 +186,22 @@ public:
/// Opens a port to a service previously registered with RegisterNamedService.
KClientPort* CreateNamedServicePort(std::string name);
/// Registers a server session with the gobal emulation state, to be freed on shutdown. This is
/// necessary because we do not emulate processes for HLE sessions.
void RegisterServerSession(KServerSession* server_session);
/// Unregisters a server session previously registered with RegisterServerSession when it was
/// destroyed during the current emulation session.
void UnregisterServerSession(KServerSession* server_session);
/// Registers all kernel objects with the global emulation state, this is purely for tracking
/// leaks after emulation has been shutdown.
void RegisterKernelObject(KAutoObject* object);
/// Unregisters a kernel object previously registered with RegisterKernelObject when it was
/// destroyed during the current emulation session.
void UnregisterKernelObject(KAutoObject* object);
/// Determines whether or not the given port is a valid named port.
bool IsValidNamedPort(NamedPortTable::const_iterator port) const;

View File

@@ -298,6 +298,7 @@ static ResultCode ConnectToNamedPort(Core::System& system, Handle* out, VAddr po
// Create a session.
KClientSession* session{};
R_TRY(port->CreateSession(std::addressof(session)));
port->Close();
// Register the session in the table, close the extra reference.
handle_table.Register(*out, session);
@@ -1439,11 +1440,6 @@ static void ExitProcess(Core::System& system) {
LOG_INFO(Kernel_SVC, "Process {} exiting", current_process->GetProcessID());
ASSERT_MSG(current_process->GetStatus() == ProcessStatus::Running,
"Process has already exited");
current_process->PrepareForTermination();
// Kill the current thread
system.Kernel().CurrentScheduler()->GetCurrentThread()->Exit();
}
static void ExitProcess32(Core::System& system) {

View File

@@ -292,7 +292,7 @@ public:
protected:
void Get(Kernel::HLERequestContext& ctx) {
LOG_DEBUG(Service_ACC, "called user_id={}", user_id.Format());
LOG_DEBUG(Service_ACC, "called user_id=0x{}", user_id.Format());
ProfileBase profile_base{};
ProfileData data{};
if (profile_manager.GetProfileBaseAndData(user_id, profile_base, data)) {
@@ -301,7 +301,7 @@ protected:
rb.Push(ResultSuccess);
rb.PushRaw(profile_base);
} else {
LOG_ERROR(Service_ACC, "Failed to get profile base and data for user={}",
LOG_ERROR(Service_ACC, "Failed to get profile base and data for user=0x{}",
user_id.Format());
IPC::ResponseBuilder rb{ctx, 2};
rb.Push(ResultUnknown); // TODO(ogniK): Get actual error code
@@ -309,14 +309,14 @@ protected:
}
void GetBase(Kernel::HLERequestContext& ctx) {
LOG_DEBUG(Service_ACC, "called user_id={}", user_id.Format());
LOG_DEBUG(Service_ACC, "called user_id=0x{}", user_id.Format());
ProfileBase profile_base{};
if (profile_manager.GetProfileBase(user_id, profile_base)) {
IPC::ResponseBuilder rb{ctx, 16};
rb.Push(ResultSuccess);
rb.PushRaw(profile_base);
} else {
LOG_ERROR(Service_ACC, "Failed to get profile base for user={}", user_id.Format());
LOG_ERROR(Service_ACC, "Failed to get profile base for user=0x{}", user_id.Format());
IPC::ResponseBuilder rb{ctx, 2};
rb.Push(ResultUnknown); // TODO(ogniK): Get actual error code
}
@@ -372,7 +372,7 @@ protected:
const auto user_data = ctx.ReadBuffer();
LOG_DEBUG(Service_ACC, "called, username='{}', timestamp={:016X}, uuid={}",
LOG_DEBUG(Service_ACC, "called, username='{}', timestamp={:016X}, uuid=0x{}",
Common::StringFromFixedZeroTerminatedBuffer(
reinterpret_cast<const char*>(base.username.data()), base.username.size()),
base.timestamp, base.user_uuid.Format());
@@ -405,7 +405,7 @@ protected:
const auto user_data = ctx.ReadBuffer();
const auto image_data = ctx.ReadBuffer(1);
LOG_DEBUG(Service_ACC, "called, username='{}', timestamp={:016X}, uuid={}",
LOG_DEBUG(Service_ACC, "called, username='{}', timestamp={:016X}, uuid=0x{}",
Common::StringFromFixedZeroTerminatedBuffer(
reinterpret_cast<const char*>(base.username.data()), base.username.size()),
base.timestamp, base.user_uuid.Format());
@@ -662,7 +662,7 @@ void Module::Interface::GetUserCount(Kernel::HLERequestContext& ctx) {
void Module::Interface::GetUserExistence(Kernel::HLERequestContext& ctx) {
IPC::RequestParser rp{ctx};
Common::UUID user_id = rp.PopRaw<Common::UUID>();
LOG_DEBUG(Service_ACC, "called user_id={}", user_id.Format());
LOG_DEBUG(Service_ACC, "called user_id=0x{}", user_id.Format());
IPC::ResponseBuilder rb{ctx, 3};
rb.Push(ResultSuccess);
@@ -693,7 +693,7 @@ void Module::Interface::GetLastOpenedUser(Kernel::HLERequestContext& ctx) {
void Module::Interface::GetProfile(Kernel::HLERequestContext& ctx) {
IPC::RequestParser rp{ctx};
Common::UUID user_id = rp.PopRaw<Common::UUID>();
LOG_DEBUG(Service_ACC, "called user_id={}", user_id.Format());
LOG_DEBUG(Service_ACC, "called user_id=0x{}", user_id.Format());
IPC::ResponseBuilder rb{ctx, 2, 0, 1};
rb.Push(ResultSuccess);
@@ -802,7 +802,7 @@ void Module::Interface::GetProfileEditor(Kernel::HLERequestContext& ctx) {
IPC::RequestParser rp{ctx};
Common::UUID user_id = rp.PopRaw<Common::UUID>();
LOG_DEBUG(Service_ACC, "called, user_id={}", user_id.Format());
LOG_DEBUG(Service_ACC, "called, user_id=0x{}", user_id.Format());
IPC::ResponseBuilder rb{ctx, 2, 0, 1};
rb.Push(ResultSuccess);
@@ -844,7 +844,7 @@ void Module::Interface::StoreSaveDataThumbnailApplication(Kernel::HLERequestCont
IPC::RequestParser rp{ctx};
const auto uuid = rp.PopRaw<Common::UUID>();
LOG_WARNING(Service_ACC, "(STUBBED) called, uuid={}", uuid.Format());
LOG_WARNING(Service_ACC, "(STUBBED) called, uuid=0x{}", uuid.Format());
// TODO(ogniK): Check if application ID is zero on acc initialize. As we don't have a reliable
// way of confirming things like the TID, we're going to assume a non zero value for the time
@@ -858,7 +858,7 @@ void Module::Interface::StoreSaveDataThumbnailSystem(Kernel::HLERequestContext&
const auto uuid = rp.PopRaw<Common::UUID>();
const auto tid = rp.Pop<u64_le>();
LOG_WARNING(Service_ACC, "(STUBBED) called, uuid={}, tid={:016X}", uuid.Format(), tid);
LOG_WARNING(Service_ACC, "(STUBBED) called, uuid=0x{}, tid={:016X}", uuid.Format(), tid);
StoreSaveDataThumbnail(ctx, uuid, tid);
}

View File

@@ -377,7 +377,8 @@ void SoftwareKeyboard::SubmitForTextCheck(std::u16string submitted_text) {
if (swkbd_config_common.use_utf8) {
std::string utf8_submitted_text = Common::UTF16ToUTF8(current_text);
const u64 buffer_size = sizeof(u64) + utf8_submitted_text.size();
// Include the null terminator in the buffer size.
const u64 buffer_size = utf8_submitted_text.size() + 1;
LOG_DEBUG(Service_AM, "\nBuffer Size: {}\nUTF-8 Submitted Text: {}", buffer_size,
utf8_submitted_text);
@@ -386,7 +387,8 @@ void SoftwareKeyboard::SubmitForTextCheck(std::u16string submitted_text) {
std::memcpy(out_data.data() + sizeof(u64), utf8_submitted_text.data(),
utf8_submitted_text.size());
} else {
const u64 buffer_size = sizeof(u64) + current_text.size() * sizeof(char16_t);
// Include the null terminator in the buffer size.
const u64 buffer_size = (current_text.size() + 1) * sizeof(char16_t);
LOG_DEBUG(Service_AM, "\nBuffer Size: {}\nUTF-16 Submitted Text: {}", buffer_size,
Common::UTF16ToUTF8(current_text));

View File

@@ -158,7 +158,7 @@ private:
const auto local_play = rp.Pop<bool>();
const auto uuid = rp.PopRaw<Common::UUID>();
LOG_WARNING(Service_Friend, "(STUBBED) called local_play={} uuid={}", local_play,
LOG_WARNING(Service_Friend, "(STUBBED) called, local_play={}, uuid=0x{}", local_play,
uuid.Format());
IPC::ResponseBuilder rb{ctx, 2};
@@ -171,7 +171,7 @@ private:
const auto uuid = rp.PopRaw<Common::UUID>();
[[maybe_unused]] const auto filter = rp.PopRaw<SizedFriendFilter>();
const auto pid = rp.Pop<u64>();
LOG_WARNING(Service_Friend, "(STUBBED) called, offset={}, uuid={}, pid={}", friend_offset,
LOG_WARNING(Service_Friend, "(STUBBED) called, offset={}, uuid=0x{}, pid={}", friend_offset,
uuid.Format(), pid);
IPC::ResponseBuilder rb{ctx, 3};
@@ -289,7 +289,7 @@ void Module::Interface::CreateNotificationService(Kernel::HLERequestContext& ctx
IPC::RequestParser rp{ctx};
auto uuid = rp.PopRaw<Common::UUID>();
LOG_DEBUG(Service_Friend, "called, uuid={}", uuid.Format());
LOG_DEBUG(Service_Friend, "called, uuid=0x{}", uuid.Format());
IPC::ResponseBuilder rb{ctx, 2, 0, 1};
rb.Push(ResultSuccess);

View File

@@ -18,6 +18,7 @@
#include "core/hle/kernel/k_writable_event.h"
#include "core/hle/kernel/kernel.h"
#include "core/hle/service/hid/controllers/npad.h"
#include "core/hle/service/kernel_helpers.h"
namespace Service::HID {
constexpr s32 HID_JOYSTICK_MAX = 0x7fff;
@@ -147,7 +148,9 @@ bool Controller_NPad::IsDeviceHandleValid(const DeviceHandle& device_handle) {
device_handle.device_index < DeviceIndex::MaxDeviceIndex;
}
Controller_NPad::Controller_NPad(Core::System& system_) : ControllerBase{system_} {
Controller_NPad::Controller_NPad(Core::System& system_,
KernelHelpers::ServiceContext& service_context_)
: ControllerBase{system_}, service_context{service_context_} {
latest_vibration_values.fill({DEFAULT_VIBRATION_VALUE, DEFAULT_VIBRATION_VALUE});
}
@@ -251,10 +254,9 @@ void Controller_NPad::InitNewlyAddedController(std::size_t controller_idx) {
}
void Controller_NPad::OnInit() {
auto& kernel = system.Kernel();
for (std::size_t i = 0; i < styleset_changed_events.size(); ++i) {
styleset_changed_events[i] = Kernel::KEvent::Create(kernel);
styleset_changed_events[i]->Initialize(fmt::format("npad:NpadStyleSetChanged_{}", i));
styleset_changed_events[i] =
service_context.CreateEvent(fmt::format("npad:NpadStyleSetChanged_{}", i));
}
if (!IsControllerActivated()) {
@@ -344,8 +346,7 @@ void Controller_NPad::OnRelease() {
}
for (std::size_t i = 0; i < styleset_changed_events.size(); ++i) {
styleset_changed_events[i]->Close();
styleset_changed_events[i] = nullptr;
service_context.CloseEvent(styleset_changed_events[i]);
}
}

View File

@@ -20,6 +20,10 @@ class KEvent;
class KReadableEvent;
} // namespace Kernel
namespace Service::KernelHelpers {
class ServiceContext;
}
namespace Service::HID {
constexpr u32 NPAD_HANDHELD = 32;
@@ -27,7 +31,8 @@ constexpr u32 NPAD_UNKNOWN = 16; // TODO(ogniK): What is this?
class Controller_NPad final : public ControllerBase {
public:
explicit Controller_NPad(Core::System& system_);
explicit Controller_NPad(Core::System& system_,
KernelHelpers::ServiceContext& service_context_);
~Controller_NPad() override;
// Called when the controller is initialized
@@ -566,6 +571,7 @@ private:
std::array<std::unique_ptr<Input::MotionDevice>, Settings::NativeMotion::NUM_MOTIONS_HID>,
10>;
KernelHelpers::ServiceContext& service_context;
std::mutex mutex;
ButtonArray buttons;
StickArray sticks;

View File

@@ -46,8 +46,9 @@ constexpr auto pad_update_ns = std::chrono::nanoseconds{1000 * 1000}; //
constexpr auto motion_update_ns = std::chrono::nanoseconds{15 * 1000 * 1000}; // (15ms, 66.666Hz)
constexpr std::size_t SHARED_MEMORY_SIZE = 0x40000;
IAppletResource::IAppletResource(Core::System& system_)
: ServiceFramework{system_, "IAppletResource"} {
IAppletResource::IAppletResource(Core::System& system_,
KernelHelpers::ServiceContext& service_context_)
: ServiceFramework{system_, "IAppletResource"}, service_context{service_context_} {
static const FunctionInfo functions[] = {
{0, &IAppletResource::GetSharedMemoryHandle, "GetSharedMemoryHandle"},
};
@@ -63,7 +64,7 @@ IAppletResource::IAppletResource(Core::System& system_)
MakeController<Controller_Stubbed>(HidController::CaptureButton);
MakeController<Controller_Stubbed>(HidController::InputDetector);
MakeController<Controller_Stubbed>(HidController::UniquePad);
MakeController<Controller_NPad>(HidController::NPad);
MakeControllerWithServiceContext<Controller_NPad>(HidController::NPad);
MakeController<Controller_Gesture>(HidController::Gesture);
MakeController<Controller_ConsoleSixAxis>(HidController::ConsoleSixAxisSensor);
@@ -191,13 +192,14 @@ private:
std::shared_ptr<IAppletResource> Hid::GetAppletResource() {
if (applet_resource == nullptr) {
applet_resource = std::make_shared<IAppletResource>(system);
applet_resource = std::make_shared<IAppletResource>(system, service_context);
}
return applet_resource;
}
Hid::Hid(Core::System& system_) : ServiceFramework{system_, "hid"} {
Hid::Hid(Core::System& system_)
: ServiceFramework{system_, "hid"}, service_context{system_, service_name} {
// clang-format off
static const FunctionInfo functions[] = {
{0, &Hid::CreateAppletResource, "CreateAppletResource"},
@@ -347,7 +349,7 @@ void Hid::CreateAppletResource(Kernel::HLERequestContext& ctx) {
LOG_DEBUG(Service_HID, "called, applet_resource_user_id={}", applet_resource_user_id);
if (applet_resource == nullptr) {
applet_resource = std::make_shared<IAppletResource>(system);
applet_resource = std::make_shared<IAppletResource>(system, service_context);
}
IPC::ResponseBuilder rb{ctx, 2, 0, 1};

View File

@@ -7,6 +7,7 @@
#include <chrono>
#include "core/hle/service/hid/controllers/controller_base.h"
#include "core/hle/service/kernel_helpers.h"
#include "core/hle/service/service.h"
namespace Core::Timing {
@@ -39,7 +40,8 @@ enum class HidController : std::size_t {
class IAppletResource final : public ServiceFramework<IAppletResource> {
public:
explicit IAppletResource(Core::System& system_);
explicit IAppletResource(Core::System& system_,
KernelHelpers::ServiceContext& service_context_);
~IAppletResource() override;
void ActivateController(HidController controller);
@@ -60,11 +62,18 @@ private:
void MakeController(HidController controller) {
controllers[static_cast<std::size_t>(controller)] = std::make_unique<T>(system);
}
template <typename T>
void MakeControllerWithServiceContext(HidController controller) {
controllers[static_cast<std::size_t>(controller)] =
std::make_unique<T>(system, service_context);
}
void GetSharedMemoryHandle(Kernel::HLERequestContext& ctx);
void UpdateControllers(std::uintptr_t user_data, std::chrono::nanoseconds ns_late);
void UpdateMotion(std::uintptr_t user_data, std::chrono::nanoseconds ns_late);
KernelHelpers::ServiceContext& service_context;
std::shared_ptr<Core::Timing::EventType> pad_update_event;
std::shared_ptr<Core::Timing::EventType> motion_update_event;
@@ -176,6 +185,8 @@ private:
static_assert(sizeof(VibrationDeviceInfo) == 0x8, "VibrationDeviceInfo has incorrect size.");
std::shared_ptr<IAppletResource> applet_resource;
KernelHelpers::ServiceContext service_context;
};
/// Reload input devices. Used when input configuration changed

View File

@@ -0,0 +1,62 @@
// Copyright 2021 yuzu emulator team
// Licensed under GPLv2 or any later version
// Refer to the license.txt file included.
#include "core/core.h"
#include "core/hle/kernel/k_event.h"
#include "core/hle/kernel/k_process.h"
#include "core/hle/kernel/k_readable_event.h"
#include "core/hle/kernel/k_resource_limit.h"
#include "core/hle/kernel/k_scoped_resource_reservation.h"
#include "core/hle/kernel/k_writable_event.h"
#include "core/hle/service/kernel_helpers.h"
namespace Service::KernelHelpers {
ServiceContext::ServiceContext(Core::System& system_, std::string name_)
: kernel(system_.Kernel()) {
process = Kernel::KProcess::Create(kernel);
ASSERT(Kernel::KProcess::Initialize(process, system_, std::move(name_),
Kernel::KProcess::ProcessType::Userland)
.IsSuccess());
}
ServiceContext::~ServiceContext() {
process->Close();
process = nullptr;
}
Kernel::KEvent* ServiceContext::CreateEvent(std::string&& name) {
// Reserve a new event from the process resource limit
Kernel::KScopedResourceReservation event_reservation(process,
Kernel::LimitableResource::Events);
if (!event_reservation.Succeeded()) {
LOG_CRITICAL(Service, "Resource limit reached!");
return {};
}
// Create a new event.
auto* event = Kernel::KEvent::Create(kernel);
if (!event) {
LOG_CRITICAL(Service, "Unable to create event!");
return {};
}
// Initialize the event.
event->Initialize(std::move(name));
// Commit the thread reservation.
event_reservation.Commit();
// Register the event.
Kernel::KEvent::Register(kernel, event);
return event;
}
void ServiceContext::CloseEvent(Kernel::KEvent* event) {
event->GetReadableEvent().Close();
event->GetWritableEvent().Close();
}
} // namespace Service::KernelHelpers

View File

@@ -0,0 +1,35 @@
// Copyright 2021 yuzu emulator team
// Licensed under GPLv2 or any later version
// Refer to the license.txt file included.
#pragma once
#include <string>
namespace Core {
class System;
}
namespace Kernel {
class KernelCore;
class KEvent;
class KProcess;
} // namespace Kernel
namespace Service::KernelHelpers {
class ServiceContext {
public:
ServiceContext(Core::System& system_, std::string name_);
~ServiceContext();
Kernel::KEvent* CreateEvent(std::string&& name);
void CloseEvent(Kernel::KEvent* event);
private:
Kernel::KernelCore& kernel;
Kernel::KProcess* process{};
};
} // namespace Service::KernelHelpers

View File

@@ -339,13 +339,16 @@ std::optional<ApplicationLanguage> ConvertToApplicationLanguage(
case Set::LanguageCode::FR_CA:
return ApplicationLanguage::CanadianFrench;
case Set::LanguageCode::PT:
case Set::LanguageCode::PT_BR:
return ApplicationLanguage::Portuguese;
case Set::LanguageCode::RU:
return ApplicationLanguage::Russian;
case Set::LanguageCode::KO:
return ApplicationLanguage::Korean;
case Set::LanguageCode::ZH_TW:
case Set::LanguageCode::ZH_HANT:
return ApplicationLanguage::TraditionalChinese;
case Set::LanguageCode::ZH_CN:
case Set::LanguageCode::ZH_HANS:
return ApplicationLanguage::SimplifiedChinese;
default:

View File

@@ -1,42 +0,0 @@
// Copyright 2019 yuzu emulator team
// Licensed under GPLv2 or any later version
// Refer to the license.txt file included.
#pragma once
#include <optional>
#include <string>
#include "common/common_types.h"
#include "core/hle/service/set/set.h"
namespace Service::NS {
/// This is nn::ns::detail::ApplicationLanguage
enum class ApplicationLanguage : u8 {
AmericanEnglish = 0,
BritishEnglish,
Japanese,
French,
German,
LatinAmericanSpanish,
Spanish,
Italian,
Dutch,
CanadianFrench,
Portuguese,
Russian,
Korean,
TraditionalChinese,
SimplifiedChinese,
Count
};
using ApplicationLanguagePriorityList =
const std::array<ApplicationLanguage, static_cast<std::size_t>(ApplicationLanguage::Count)>;
constexpr u32 GetSupportedLanguageFlag(const ApplicationLanguage lang) {
return 1U << static_cast<u32>(lang);
}
const ApplicationLanguagePriorityList* GetApplicationLanguagePriorityList(ApplicationLanguage lang);
std::optional<ApplicationLanguage> ConvertToApplicationLanguage(
Service::Set::LanguageCode language_code);
std::optional<Service::Set::LanguageCode> ConvertToLanguageCode(ApplicationLanguage lang);
} // namespace Service::NS

View File

@@ -54,7 +54,7 @@ void nvdisp_disp0::flip(u32 buffer_handle, u32 offset, u32 format, u32 width, u3
system.GetPerfStats().EndSystemFrame();
system.GPU().SwapBuffers(&framebuffer);
system.FrameLimiter().DoFrameLimiting(system.CoreTiming().GetGlobalTimeUs());
system.SpeedLimiter().DoSpeedLimiting(system.CoreTiming().GetGlobalTimeUs());
system.GetPerfStats().BeginSystemFrame();
}

View File

@@ -166,8 +166,6 @@ NvResult nvhost_nvdec_common::MapBuffer(const std::vector<u8>& input, std::vecto
LOG_ERROR(Service_NVDRV, "failed to map size={}", object->size);
} else {
cmd_buffer.map_address = object->dma_map_addr;
AddBufferMap(object->dma_map_addr, object->size, object->addr,
object->status == nvmap::Object::Status::Allocated);
}
}
std::memcpy(output.data(), &params, sizeof(IoctlMapBuffer));
@@ -178,30 +176,11 @@ NvResult nvhost_nvdec_common::MapBuffer(const std::vector<u8>& input, std::vecto
}
NvResult nvhost_nvdec_common::UnmapBuffer(const std::vector<u8>& input, std::vector<u8>& output) {
IoctlMapBuffer params{};
std::memcpy(&params, input.data(), sizeof(IoctlMapBuffer));
std::vector<MapBufferEntry> cmd_buffer_handles(params.num_entries);
SliceVectors(input, cmd_buffer_handles, params.num_entries, sizeof(IoctlMapBuffer));
auto& gpu = system.GPU();
for (auto& cmd_buffer : cmd_buffer_handles) {
const auto object{nvmap_dev->GetObject(cmd_buffer.map_handle)};
if (!object) {
LOG_ERROR(Service_NVDRV, "invalid cmd_buffer nvmap_handle={:X}", cmd_buffer.map_handle);
std::memcpy(output.data(), &params, output.size());
return NvResult::InvalidState;
}
if (const auto size{RemoveBufferMap(object->dma_map_addr)}; size) {
gpu.MemoryManager().Unmap(object->dma_map_addr, *size);
} else {
// This occurs quite frequently, however does not seem to impact functionality
LOG_DEBUG(Service_NVDRV, "invalid offset=0x{:X} dma=0x{:X}", object->addr,
object->dma_map_addr);
}
object->dma_map_addr = 0;
}
// This is intntionally stubbed.
// Skip unmapping buffers here, as to not break the continuity of the VP9 reference frame
// addresses, and risk invalidating data before the async GPU thread is done with it
std::memset(output.data(), 0, output.size());
LOG_DEBUG(Service_NVDRV, "(STUBBED) called");
return NvResult::Success;
}
@@ -212,33 +191,4 @@ NvResult nvhost_nvdec_common::SetSubmitTimeout(const std::vector<u8>& input,
return NvResult::Success;
}
std::optional<nvhost_nvdec_common::BufferMap> nvhost_nvdec_common::FindBufferMap(
GPUVAddr gpu_addr) const {
const auto it = std::find_if(
buffer_mappings.begin(), buffer_mappings.upper_bound(gpu_addr), [&](const auto& entry) {
return (gpu_addr >= entry.second.StartAddr() && gpu_addr < entry.second.EndAddr());
});
ASSERT(it != buffer_mappings.end());
return it->second;
}
void nvhost_nvdec_common::AddBufferMap(GPUVAddr gpu_addr, std::size_t size, VAddr cpu_addr,
bool is_allocated) {
buffer_mappings.insert_or_assign(gpu_addr, BufferMap{gpu_addr, size, cpu_addr, is_allocated});
}
std::optional<std::size_t> nvhost_nvdec_common::RemoveBufferMap(GPUVAddr gpu_addr) {
const auto iter{buffer_mappings.find(gpu_addr)};
if (iter == buffer_mappings.end()) {
return std::nullopt;
}
std::size_t size = 0;
if (iter->second.IsAllocated()) {
size = iter->second.Size();
}
buffer_mappings.erase(iter);
return size;
}
} // namespace Service::Nvidia::Devices

View File

@@ -23,45 +23,6 @@ public:
~nvhost_nvdec_common() override;
protected:
class BufferMap final {
public:
constexpr BufferMap() = default;
constexpr BufferMap(GPUVAddr start_addr_, std::size_t size_)
: start_addr{start_addr_}, end_addr{start_addr_ + size_} {}
constexpr BufferMap(GPUVAddr start_addr_, std::size_t size_, VAddr cpu_addr_,
bool is_allocated_)
: start_addr{start_addr_}, end_addr{start_addr_ + size_}, cpu_addr{cpu_addr_},
is_allocated{is_allocated_} {}
constexpr VAddr StartAddr() const {
return start_addr;
}
constexpr VAddr EndAddr() const {
return end_addr;
}
constexpr std::size_t Size() const {
return end_addr - start_addr;
}
constexpr VAddr CpuAddr() const {
return cpu_addr;
}
constexpr bool IsAllocated() const {
return is_allocated;
}
private:
GPUVAddr start_addr{};
GPUVAddr end_addr{};
VAddr cpu_addr{};
bool is_allocated{};
};
struct IoctlSetNvmapFD {
s32_le nvmap_fd{};
};
@@ -154,17 +115,11 @@ protected:
NvResult UnmapBuffer(const std::vector<u8>& input, std::vector<u8>& output);
NvResult SetSubmitTimeout(const std::vector<u8>& input, std::vector<u8>& output);
std::optional<BufferMap> FindBufferMap(GPUVAddr gpu_addr) const;
void AddBufferMap(GPUVAddr gpu_addr, std::size_t size, VAddr cpu_addr, bool is_allocated);
std::optional<std::size_t> RemoveBufferMap(GPUVAddr gpu_addr);
s32_le nvmap_fd{};
u32_le submit_timeout{};
std::shared_ptr<nvmap> nvmap_dev;
SyncpointManager& syncpoint_manager;
std::array<u32, MaxSyncPoints> device_syncpoints{};
// This is expected to be ordered, therefore we must use a map, not unordered_map
std::map<GPUVAddr, BufferMap> buffer_mappings;
};
}; // namespace Devices
} // namespace Service::Nvidia

View File

@@ -39,11 +39,11 @@ void InstallInterfaces(SM::ServiceManager& service_manager, NVFlinger::NVFlinger
nvflinger.SetNVDrvInstance(module_);
}
Module::Module(Core::System& system) : syncpoint_manager{system.GPU()} {
auto& kernel = system.Kernel();
Module::Module(Core::System& system)
: syncpoint_manager{system.GPU()}, service_context{system, "nvdrv"} {
for (u32 i = 0; i < MaxNvEvents; i++) {
events_interface.events[i].event = Kernel::KEvent::Create(kernel);
events_interface.events[i].event->Initialize(fmt::format("NVDRV::NvEvent_{}", i));
events_interface.events[i].event =
service_context.CreateEvent(fmt::format("NVDRV::NvEvent_{}", i));
events_interface.status[i] = EventState::Free;
events_interface.registered[i] = false;
}
@@ -65,8 +65,7 @@ Module::Module(Core::System& system) : syncpoint_manager{system.GPU()} {
Module::~Module() {
for (u32 i = 0; i < MaxNvEvents; i++) {
events_interface.events[i].event->Close();
events_interface.events[i].event = nullptr;
service_context.CloseEvent(events_interface.events[i].event);
}
}

View File

@@ -9,6 +9,7 @@
#include <vector>
#include "common/common_types.h"
#include "core/hle/service/kernel_helpers.h"
#include "core/hle/service/nvdrv/nvdata.h"
#include "core/hle/service/nvdrv/syncpoint_manager.h"
#include "core/hle/service/service.h"
@@ -154,6 +155,8 @@ private:
std::unordered_map<std::string, std::shared_ptr<Devices::nvdevice>> devices;
EventInterface events_interface;
KernelHelpers::ServiceContext service_context;
};
/// Registers all NVDRV services with the specified service manager.

View File

@@ -307,11 +307,12 @@ void NVFlinger::Compose() {
}
s64 NVFlinger::GetNextTicks() const {
if (Settings::values.disable_fps_limit.GetValue()) {
return 0;
}
constexpr s64 max_hertz = 120LL;
return (1000000000 * (1LL << swap_interval)) / max_hertz;
static constexpr s64 max_hertz = 120LL;
const auto& settings = Settings::values;
const bool unlocked_fps = settings.disable_fps_limit.GetValue();
const s64 fps_cap = unlocked_fps ? static_cast<s64>(settings.fps_cap.GetValue()) : 1;
return (1000000000 * (1LL << swap_interval)) / (max_hertz * fps_cap);
}
} // namespace Service::NVFlinger

View File

@@ -104,23 +104,22 @@ ServiceFrameworkBase::~ServiceFrameworkBase() {
void ServiceFrameworkBase::InstallAsService(SM::ServiceManager& service_manager) {
const auto guard = LockService();
ASSERT(!port_installed);
ASSERT(!service_registered);
auto port = service_manager.RegisterService(service_name, max_sessions).Unwrap();
port->SetSessionHandler(shared_from_this());
port_installed = true;
service_manager.RegisterService(service_name, max_sessions, shared_from_this());
service_registered = true;
}
Kernel::KClientPort& ServiceFrameworkBase::CreatePort() {
const auto guard = LockService();
ASSERT(!port_installed);
ASSERT(!service_registered);
auto* port = Kernel::KPort::Create(kernel);
port->Initialize(max_sessions, false, service_name);
port->GetServerPort().SetSessionHandler(shared_from_this());
port_installed = true;
service_registered = true;
return port->GetClientPort();
}

View File

@@ -96,6 +96,9 @@ protected:
/// System context that the service operates under.
Core::System& system;
/// Identifier string used to connect to the service.
std::string service_name;
private:
template <typename T>
friend class ServiceFramework;
@@ -117,14 +120,12 @@ private:
void RegisterHandlersBaseTipc(const FunctionInfoBase* functions, std::size_t n);
void ReportUnimplementedFunction(Kernel::HLERequestContext& ctx, const FunctionInfoBase* info);
/// Identifier string used to connect to the service.
std::string service_name;
/// Maximum number of concurrent sessions that this service can handle.
u32 max_sessions;
/// Flag to store if a port was already create/installed to detect multiple install attempts,
/// which is not supported.
bool port_installed = false;
bool service_registered = false;
/// Function used to safely up-cast pointers to the derived class before invoking a handler.
InvokerFn* handler_invoker;

View File

@@ -12,7 +12,7 @@
namespace Service::Set {
namespace {
constexpr std::array<LanguageCode, 17> available_language_codes = {{
constexpr std::array<LanguageCode, 18> available_language_codes = {{
LanguageCode::JA,
LanguageCode::EN_US,
LanguageCode::FR,
@@ -30,6 +30,7 @@ constexpr std::array<LanguageCode, 17> available_language_codes = {{
LanguageCode::ES_419,
LanguageCode::ZH_HANS,
LanguageCode::ZH_HANT,
LanguageCode::PT_BR,
}};
enum class KeyboardLayout : u64 {
@@ -50,7 +51,7 @@ enum class KeyboardLayout : u64 {
ChineseTraditional = 14,
};
constexpr std::array<std::pair<LanguageCode, KeyboardLayout>, 17> language_to_layout{{
constexpr std::array<std::pair<LanguageCode, KeyboardLayout>, 18> language_to_layout{{
{LanguageCode::JA, KeyboardLayout::Japanese},
{LanguageCode::EN_US, KeyboardLayout::EnglishUs},
{LanguageCode::FR, KeyboardLayout::French},
@@ -68,10 +69,11 @@ constexpr std::array<std::pair<LanguageCode, KeyboardLayout>, 17> language_to_la
{LanguageCode::ES_419, KeyboardLayout::SpanishLatin},
{LanguageCode::ZH_HANS, KeyboardLayout::ChineseSimplified},
{LanguageCode::ZH_HANT, KeyboardLayout::ChineseTraditional},
{LanguageCode::PT_BR, KeyboardLayout::Portuguese},
}};
constexpr std::size_t pre4_0_0_max_entries = 15;
constexpr std::size_t post4_0_0_max_entries = 17;
constexpr std::size_t PRE_4_0_0_MAX_ENTRIES = 0xF;
constexpr std::size_t POST_4_0_0_MAX_ENTRIES = 0x40;
constexpr ResultCode ERR_INVALID_LANGUAGE{ErrorModule::Settings, 625};
@@ -81,9 +83,10 @@ void PushResponseLanguageCode(Kernel::HLERequestContext& ctx, std::size_t num_la
rb.Push(static_cast<u32>(num_language_codes));
}
void GetAvailableLanguageCodesImpl(Kernel::HLERequestContext& ctx, std::size_t max_size) {
void GetAvailableLanguageCodesImpl(Kernel::HLERequestContext& ctx, std::size_t max_entries) {
const std::size_t requested_amount = ctx.GetWriteBufferSize() / sizeof(LanguageCode);
const std::size_t copy_amount = std::min(requested_amount, max_size);
const std::size_t max_amount = std::min(requested_amount, max_entries);
const std::size_t copy_amount = std::min(available_language_codes.size(), max_amount);
const std::size_t copy_size = copy_amount * sizeof(LanguageCode);
ctx.WriteBuffer(available_language_codes.data(), copy_size);
@@ -118,7 +121,7 @@ LanguageCode GetLanguageCodeFromIndex(std::size_t index) {
void SET::GetAvailableLanguageCodes(Kernel::HLERequestContext& ctx) {
LOG_DEBUG(Service_SET, "called");
GetAvailableLanguageCodesImpl(ctx, pre4_0_0_max_entries);
GetAvailableLanguageCodesImpl(ctx, PRE_4_0_0_MAX_ENTRIES);
}
void SET::MakeLanguageCode(Kernel::HLERequestContext& ctx) {
@@ -140,19 +143,19 @@ void SET::MakeLanguageCode(Kernel::HLERequestContext& ctx) {
void SET::GetAvailableLanguageCodes2(Kernel::HLERequestContext& ctx) {
LOG_DEBUG(Service_SET, "called");
GetAvailableLanguageCodesImpl(ctx, post4_0_0_max_entries);
GetAvailableLanguageCodesImpl(ctx, POST_4_0_0_MAX_ENTRIES);
}
void SET::GetAvailableLanguageCodeCount(Kernel::HLERequestContext& ctx) {
LOG_DEBUG(Service_SET, "called");
PushResponseLanguageCode(ctx, pre4_0_0_max_entries);
PushResponseLanguageCode(ctx, PRE_4_0_0_MAX_ENTRIES);
}
void SET::GetAvailableLanguageCodeCount2(Kernel::HLERequestContext& ctx) {
LOG_DEBUG(Service_SET, "called");
PushResponseLanguageCode(ctx, post4_0_0_max_entries);
PushResponseLanguageCode(ctx, POST_4_0_0_MAX_ENTRIES);
}
void SET::GetQuestFlag(Kernel::HLERequestContext& ctx) {

View File

@@ -31,6 +31,7 @@ enum class LanguageCode : u64 {
ES_419 = 0x00003931342D7365,
ZH_HANS = 0x00736E61482D687A,
ZH_HANT = 0x00746E61482D687A,
PT_BR = 0x00000052422D7470,
};
LanguageCode GetLanguageCodeFromIndex(std::size_t idx);

View File

@@ -4,6 +4,7 @@
#include <tuple>
#include "common/assert.h"
#include "common/scope_exit.h"
#include "core/core.h"
#include "core/hle/ipc_helpers.h"
#include "core/hle/kernel/k_client_port.h"
@@ -40,17 +41,13 @@ static ResultCode ValidateServiceName(const std::string& name) {
}
Kernel::KClientPort& ServiceManager::InterfaceFactory(ServiceManager& self, Core::System& system) {
ASSERT(self.sm_interface.expired());
auto sm = std::make_shared<SM>(self, system);
self.sm_interface = sm;
self.sm_interface = std::make_shared<SM>(self, system);
self.controller_interface = std::make_unique<Controller>(system);
return sm->CreatePort();
return self.sm_interface->CreatePort();
}
ResultVal<Kernel::KServerPort*> ServiceManager::RegisterService(std::string name,
u32 max_sessions) {
ResultCode ServiceManager::RegisterService(std::string name, u32 max_sessions,
Kernel::SessionRequestHandlerPtr handler) {
CASCADE_CODE(ValidateServiceName(name));
@@ -59,12 +56,9 @@ ResultVal<Kernel::KServerPort*> ServiceManager::RegisterService(std::string name
return ERR_ALREADY_REGISTERED;
}
auto* port = Kernel::KPort::Create(kernel);
port->Initialize(max_sessions, false, name);
registered_services.emplace(std::move(name), handler);
registered_services.emplace(std::move(name), port);
return MakeResult(&port->GetServerPort());
return ResultSuccess;
}
ResultCode ServiceManager::UnregisterService(const std::string& name) {
@@ -76,14 +70,11 @@ ResultCode ServiceManager::UnregisterService(const std::string& name) {
return ERR_SERVICE_NOT_REGISTERED;
}
iter->second->Close();
registered_services.erase(iter);
return ResultSuccess;
}
ResultVal<Kernel::KPort*> ServiceManager::GetServicePort(const std::string& name) {
CASCADE_CODE(ValidateServiceName(name));
auto it = registered_services.find(name);
if (it == registered_services.end()) {
@@ -91,10 +82,13 @@ ResultVal<Kernel::KPort*> ServiceManager::GetServicePort(const std::string& name
return ERR_SERVICE_NOT_REGISTERED;
}
return MakeResult(it->second);
}
auto* port = Kernel::KPort::Create(kernel);
port->Initialize(ServerSessionCountMax, false, name);
auto handler = it->second;
port->GetServerPort().SetSessionHandler(std::move(handler));
SM::~SM() = default;
return MakeResult(port);
}
/**
* SM::Initialize service function
@@ -156,11 +150,15 @@ ResultVal<Kernel::KClientSession*> SM::GetServiceImpl(Kernel::HLERequestContext&
LOG_ERROR(Service_SM, "called service={} -> error 0x{:08X}", name, port_result.Code().raw);
return port_result.Code();
}
auto& port = port_result.Unwrap()->GetClientPort();
auto& port = port_result.Unwrap();
SCOPE_EXIT({ port->GetClientPort().Close(); });
server_ports.emplace_back(&port->GetServerPort());
// Create a new session.
Kernel::KClientSession* session{};
if (const auto result = port.CreateSession(std::addressof(session)); result.IsError()) {
if (const auto result = port->GetClientPort().CreateSession(std::addressof(session));
result.IsError()) {
LOG_ERROR(Service_SM, "called service={} -> error 0x{:08X}", name, result.raw);
return result;
}
@@ -180,20 +178,21 @@ void SM::RegisterService(Kernel::HLERequestContext& ctx) {
LOG_DEBUG(Service_SM, "called with name={}, max_session_count={}, is_light={}", name,
max_session_count, is_light);
auto handle = service_manager.RegisterService(name, max_session_count);
if (handle.Failed()) {
LOG_ERROR(Service_SM, "failed to register service with error_code={:08X}",
handle.Code().raw);
if (const auto result = service_manager.RegisterService(name, max_session_count, nullptr);
result.IsError()) {
LOG_ERROR(Service_SM, "failed to register service with error_code={:08X}", result.raw);
IPC::ResponseBuilder rb{ctx, 2};
rb.Push(handle.Code());
rb.Push(result);
return;
}
IPC::ResponseBuilder rb{ctx, 2, 0, 1, IPC::ResponseBuilder::Flags::AlwaysMoveHandles};
rb.Push(handle.Code());
auto* port = Kernel::KPort::Create(kernel);
port->Initialize(ServerSessionCountMax, is_light, name);
SCOPE_EXIT({ port->GetClientPort().Close(); });
auto server_port = handle.Unwrap();
rb.PushMoveObjects(server_port);
IPC::ResponseBuilder rb{ctx, 2, 0, 1, IPC::ResponseBuilder::Flags::AlwaysMoveHandles};
rb.Push(ResultSuccess);
rb.PushMoveObjects(port->GetServerPort());
}
void SM::UnregisterService(Kernel::HLERequestContext& ctx) {
@@ -225,4 +224,10 @@ SM::SM(ServiceManager& service_manager_, Core::System& system_)
});
}
SM::~SM() {
for (auto& server_port : server_ports) {
server_port->Close();
}
}
} // namespace Service::SM

View File

@@ -49,6 +49,7 @@ private:
ServiceManager& service_manager;
bool is_initialized{};
Kernel::KernelCore& kernel;
std::vector<Kernel::KServerPort*> server_ports;
};
class ServiceManager {
@@ -58,7 +59,8 @@ public:
explicit ServiceManager(Kernel::KernelCore& kernel_);
~ServiceManager();
ResultVal<Kernel::KServerPort*> RegisterService(std::string name, u32 max_sessions);
ResultCode RegisterService(std::string name, u32 max_sessions,
Kernel::SessionRequestHandlerPtr handler);
ResultCode UnregisterService(const std::string& name);
ResultVal<Kernel::KPort*> GetServicePort(const std::string& name);
@@ -69,21 +71,17 @@ public:
LOG_DEBUG(Service, "Can't find service: {}", service_name);
return nullptr;
}
auto* port = service->second;
if (port == nullptr) {
return nullptr;
}
return std::static_pointer_cast<T>(port->GetServerPort().GetSessionRequestHandler());
return std::static_pointer_cast<T>(service->second);
}
void InvokeControlRequest(Kernel::HLERequestContext& context);
private:
std::weak_ptr<SM> sm_interface;
std::shared_ptr<SM> sm_interface;
std::unique_ptr<Controller> controller_interface;
/// Map of registered services, retrieved using GetServicePort.
std::unordered_map<std::string, Kernel::KPort*> registered_services;
std::unordered_map<std::string, Kernel::SessionRequestHandlerPtr> registered_services;
/// Kernel context
Kernel::KernelCore& kernel;

View File

@@ -570,7 +570,7 @@ std::pair<s32, Errno> Socket::SendTo(u32 flags, const std::vector<u8>& message,
ASSERT(flags == 0);
const sockaddr* to = nullptr;
const int tolen = addr ? 0 : sizeof(sockaddr);
const int tolen = addr ? sizeof(sockaddr) : 0;
sockaddr host_addr_in;
if (addr) {

View File

@@ -127,15 +127,15 @@ double PerfStats::GetLastFrameTimeScale() const {
return duration_cast<DoubleSecs>(previous_frame_length).count() / FRAME_LENGTH;
}
void FrameLimiter::DoFrameLimiting(microseconds current_system_time_us) {
if (!Settings::values.use_frame_limit.GetValue() ||
void SpeedLimiter::DoSpeedLimiting(microseconds current_system_time_us) {
if (!Settings::values.use_speed_limit.GetValue() ||
Settings::values.use_multi_core.GetValue()) {
return;
}
auto now = Clock::now();
const double sleep_scale = Settings::values.frame_limit.GetValue() / 100.0;
const double sleep_scale = Settings::values.speed_limit.GetValue() / 100.0;
// Max lag caused by slow frames. Shouldn't be more than the length of a frame at the current
// speed percent or it will clamp too much and prevent this from properly limiting to that
@@ -143,17 +143,17 @@ void FrameLimiter::DoFrameLimiting(microseconds current_system_time_us) {
// limiting
const microseconds max_lag_time_us = duration_cast<microseconds>(
std::chrono::duration<double, std::chrono::microseconds::period>(25ms / sleep_scale));
frame_limiting_delta_err += duration_cast<microseconds>(
speed_limiting_delta_err += duration_cast<microseconds>(
std::chrono::duration<double, std::chrono::microseconds::period>(
(current_system_time_us - previous_system_time_us) / sleep_scale));
frame_limiting_delta_err -= duration_cast<microseconds>(now - previous_walltime);
frame_limiting_delta_err =
std::clamp(frame_limiting_delta_err, -max_lag_time_us, max_lag_time_us);
speed_limiting_delta_err -= duration_cast<microseconds>(now - previous_walltime);
speed_limiting_delta_err =
std::clamp(speed_limiting_delta_err, -max_lag_time_us, max_lag_time_us);
if (frame_limiting_delta_err > microseconds::zero()) {
std::this_thread::sleep_for(frame_limiting_delta_err);
if (speed_limiting_delta_err > microseconds::zero()) {
std::this_thread::sleep_for(speed_limiting_delta_err);
auto now_after_sleep = Clock::now();
frame_limiting_delta_err -= duration_cast<microseconds>(now_after_sleep - now);
speed_limiting_delta_err -= duration_cast<microseconds>(now_after_sleep - now);
now = now_after_sleep;
}

View File

@@ -85,11 +85,11 @@ private:
double previous_fps = 0;
};
class FrameLimiter {
class SpeedLimiter {
public:
using Clock = std::chrono::high_resolution_clock;
void DoFrameLimiting(std::chrono::microseconds current_system_time_us);
void DoSpeedLimiting(std::chrono::microseconds current_system_time_us);
private:
/// Emulated system time (in microseconds) at the last limiter invocation
@@ -98,7 +98,7 @@ private:
Clock::time_point previous_walltime = Clock::now();
/// Accumulated difference between walltime and emulated time
std::chrono::microseconds frame_limiting_delta_err{0};
std::chrono::microseconds speed_limiting_delta_err{0};
};
} // namespace Core

View File

@@ -62,7 +62,6 @@ json GetYuzuVersionData() {
{"build_date", std::string(Common::g_build_date)},
{"build_fullname", std::string(Common::g_build_fullname)},
{"build_version", std::string(Common::g_build_version)},
{"shader_cache_version", std::string(Common::g_shader_cache_version)},
};
}

View File

@@ -221,8 +221,8 @@ void TelemetrySession::AddInitialInfo(Loader::AppLoader& app_loader,
TranslateRenderer(Settings::values.renderer_backend.GetValue()));
AddField(field_type, "Renderer_ResolutionFactor",
Settings::values.resolution_factor.GetValue());
AddField(field_type, "Renderer_UseFrameLimit", Settings::values.use_frame_limit.GetValue());
AddField(field_type, "Renderer_FrameLimit", Settings::values.frame_limit.GetValue());
AddField(field_type, "Renderer_UseSpeedLimit", Settings::values.use_speed_limit.GetValue());
AddField(field_type, "Renderer_SpeedLimit", Settings::values.speed_limit.GetValue());
AddField(field_type, "Renderer_UseDiskShaderCache",
Settings::values.use_disk_shader_cache.GetValue());
AddField(field_type, "Renderer_GPUAccuracyLevel",
@@ -233,8 +233,8 @@ void TelemetrySession::AddInitialInfo(Loader::AppLoader& app_loader,
Settings::values.use_nvdec_emulation.GetValue());
AddField(field_type, "Renderer_AccelerateASTC", Settings::values.accelerate_astc.GetValue());
AddField(field_type, "Renderer_UseVsync", Settings::values.use_vsync.GetValue());
AddField(field_type, "Renderer_UseAssemblyShaders",
Settings::values.use_assembly_shaders.GetValue());
AddField(field_type, "Renderer_ShaderBackend",
static_cast<u32>(Settings::values.shader_backend.GetValue()));
AddField(field_type, "Renderer_UseAsynchronousShaders",
Settings::values.use_asynchronous_shaders.GetValue());
AddField(field_type, "System_UseDockedMode", Settings::values.use_docked_mode.GetValue());

View File

@@ -304,10 +304,10 @@ std::vector<std::unique_ptr<Polling::DevicePoller>> InputSubsystem::GetPollers([
}
std::string GenerateKeyboardParam(int key_code) {
Common::ParamPackage param{
{"engine", "keyboard"},
{"code", std::to_string(key_code)},
};
Common::ParamPackage param;
param.Set("engine", "keyboard");
param.Set("code", key_code);
param.Set("toggle", false);
return param.Serialize();
}

View File

@@ -57,6 +57,7 @@ Common::ParamPackage MouseButtonFactory::GetNextInput() const {
if (pad.button != MouseInput::MouseButton::Undefined) {
params.Set("engine", "mouse");
params.Set("button", static_cast<u16>(pad.button));
params.Set("toggle", false);
return params;
}
}

View File

@@ -82,6 +82,12 @@ public:
state.buttons.insert_or_assign(button, value);
}
void PreSetButton(int button) {
if (!state.buttons.contains(button)) {
SetButton(button, false);
}
}
void SetMotion(SDL_ControllerSensorEvent event) {
constexpr float gravity_constant = 9.80665f;
std::lock_guard lock{mutex};
@@ -155,9 +161,16 @@ public:
state.axes.insert_or_assign(axis, value);
}
float GetAxis(int axis, float range) const {
void PreSetAxis(int axis) {
if (!state.axes.contains(axis)) {
SetAxis(axis, 0);
}
}
float GetAxis(int axis, float range, float offset) const {
std::lock_guard lock{mutex};
return static_cast<float>(state.axes.at(axis)) / (32767.0f * range);
const float value = static_cast<float>(state.axes.at(axis)) / 32767.0f;
return (value + offset) / range;
}
bool RumblePlay(u16 amp_low, u16 amp_high) {
@@ -174,9 +187,10 @@ public:
return false;
}
std::tuple<float, float> GetAnalog(int axis_x, int axis_y, float range) const {
float x = GetAxis(axis_x, range);
float y = GetAxis(axis_y, range);
std::tuple<float, float> GetAnalog(int axis_x, int axis_y, float range, float offset_x,
float offset_y) const {
float x = GetAxis(axis_x, range, offset_x);
float y = GetAxis(axis_y, range, offset_y);
y = -y; // 3DS uses an y-axis inverse from SDL
// Make sure the coordinates are in the unit circle,
@@ -483,7 +497,7 @@ public:
trigger_if_greater(trigger_if_greater_) {}
bool GetStatus() const override {
const float axis_value = joystick->GetAxis(axis, 1.0f);
const float axis_value = joystick->GetAxis(axis, 1.0f, 0.0f);
if (trigger_if_greater) {
return axis_value > threshold;
}
@@ -500,12 +514,14 @@ private:
class SDLAnalog final : public Input::AnalogDevice {
public:
explicit SDLAnalog(std::shared_ptr<SDLJoystick> joystick_, int axis_x_, int axis_y_,
bool invert_x_, bool invert_y_, float deadzone_, float range_)
bool invert_x_, bool invert_y_, float deadzone_, float range_,
float offset_x_, float offset_y_)
: joystick(std::move(joystick_)), axis_x(axis_x_), axis_y(axis_y_), invert_x(invert_x_),
invert_y(invert_y_), deadzone(deadzone_), range(range_) {}
invert_y(invert_y_), deadzone(deadzone_), range(range_), offset_x(offset_x_),
offset_y(offset_y_) {}
std::tuple<float, float> GetStatus() const override {
auto [x, y] = joystick->GetAnalog(axis_x, axis_y, range);
auto [x, y] = joystick->GetAnalog(axis_x, axis_y, range, offset_x, offset_y);
const float r = std::sqrt((x * x) + (y * y));
if (invert_x) {
x = -x;
@@ -522,8 +538,8 @@ public:
}
std::tuple<float, float> GetRawStatus() const override {
const float x = joystick->GetAxis(axis_x, range);
const float y = joystick->GetAxis(axis_y, range);
const float x = joystick->GetAxis(axis_x, range, offset_x);
const float y = joystick->GetAxis(axis_y, range, offset_y);
return {x, -y};
}
@@ -555,6 +571,8 @@ private:
const bool invert_y;
const float deadzone;
const float range;
const float offset_x;
const float offset_y;
};
class SDLVibration final : public Input::VibrationDevice {
@@ -621,7 +639,7 @@ public:
trigger_if_greater(trigger_if_greater_) {}
Input::MotionStatus GetStatus() const override {
const float axis_value = joystick->GetAxis(axis, 1.0f);
const float axis_value = joystick->GetAxis(axis, 1.0f, 0.0f);
bool trigger = axis_value < threshold;
if (trigger_if_greater) {
trigger = axis_value > threshold;
@@ -707,7 +725,8 @@ public:
if (params.Has("axis")) {
const int axis = params.Get("axis", 0);
const float threshold = params.Get("threshold", 0.5f);
// Convert range from (0.0, 1.0) to (-1.0, 1.0)
const float threshold = (params.Get("threshold", 0.5f) - 0.5f) * 2.0f;
const std::string direction_name = params.Get("direction", "");
bool trigger_if_greater;
if (direction_name == "+") {
@@ -719,13 +738,13 @@ public:
LOG_ERROR(Input, "Unknown direction {}", direction_name);
}
// This is necessary so accessing GetAxis with axis won't crash
joystick->SetAxis(axis, 0);
joystick->PreSetAxis(axis);
return std::make_unique<SDLAxisButton>(joystick, axis, threshold, trigger_if_greater);
}
const int button = params.Get("button", 0);
// This is necessary so accessing GetButton with button won't crash
joystick->SetButton(button, false);
joystick->PreSetButton(button);
return std::make_unique<SDLButton>(joystick, button, toggle);
}
@@ -756,13 +775,15 @@ public:
const std::string invert_y_value = params.Get("invert_y", "+");
const bool invert_x = invert_x_value == "-";
const bool invert_y = invert_y_value == "-";
const float offset_x = params.Get("offset_x", 0.0f);
const float offset_y = params.Get("offset_y", 0.0f);
auto joystick = state.GetSDLJoystickByGUID(guid, port);
// This is necessary so accessing GetAxis with axis_x and axis_y won't crash
joystick->SetAxis(axis_x, 0);
joystick->SetAxis(axis_y, 0);
joystick->PreSetAxis(axis_x);
joystick->PreSetAxis(axis_y);
return std::make_unique<SDLAnalog>(joystick, axis_x, axis_y, invert_x, invert_y, deadzone,
range);
range, offset_x, offset_y);
}
private:
@@ -843,13 +864,13 @@ public:
LOG_ERROR(Input, "Unknown direction {}", direction_name);
}
// This is necessary so accessing GetAxis with axis won't crash
joystick->SetAxis(axis, 0);
joystick->PreSetAxis(axis);
return std::make_unique<SDLAxisMotion>(joystick, axis, threshold, trigger_if_greater);
}
const int button = params.Get("button", 0);
// This is necessary so accessing GetButton with button won't crash
joystick->SetButton(button, false);
joystick->PreSetButton(button);
return std::make_unique<SDLButtonMotion>(joystick, button);
}
@@ -980,12 +1001,11 @@ Common::ParamPackage BuildAnalogParamPackageForButton(int port, std::string guid
params.Set("port", port);
params.Set("guid", std::move(guid));
params.Set("axis", axis);
params.Set("threshold", "0.5");
if (value > 0) {
params.Set("direction", "+");
params.Set("threshold", "0.5");
} else {
params.Set("direction", "-");
params.Set("threshold", "-0.5");
}
return params;
}
@@ -995,6 +1015,7 @@ Common::ParamPackage BuildButtonParamPackageForButton(int port, std::string guid
params.Set("port", port);
params.Set("guid", std::move(guid));
params.Set("button", button);
params.Set("toggle", false);
return params;
}
@@ -1134,13 +1155,15 @@ Common::ParamPackage BuildParamPackageForBinding(int port, const std::string& gu
}
Common::ParamPackage BuildParamPackageForAnalog(int port, const std::string& guid, int axis_x,
int axis_y) {
int axis_y, float offset_x, float offset_y) {
Common::ParamPackage params;
params.Set("engine", "sdl");
params.Set("port", port);
params.Set("guid", guid);
params.Set("axis_x", axis_x);
params.Set("axis_y", axis_y);
params.Set("offset_x", offset_x);
params.Set("offset_y", offset_y);
params.Set("invert_x", "+");
params.Set("invert_y", "+");
return params;
@@ -1342,24 +1365,39 @@ AnalogMapping SDLState::GetAnalogMappingForDevice(const Common::ParamPackage& pa
const auto& binding_left_y =
SDL_GameControllerGetBindForAxis(controller, SDL_CONTROLLER_AXIS_LEFTY);
if (params.Has("guid2")) {
joystick2->PreSetAxis(binding_left_x.value.axis);
joystick2->PreSetAxis(binding_left_y.value.axis);
const auto left_offset_x = -joystick2->GetAxis(binding_left_x.value.axis, 1.0f, 0);
const auto left_offset_y = -joystick2->GetAxis(binding_left_y.value.axis, 1.0f, 0);
mapping.insert_or_assign(
Settings::NativeAnalog::LStick,
BuildParamPackageForAnalog(joystick2->GetPort(), joystick2->GetGUID(),
binding_left_x.value.axis, binding_left_y.value.axis));
binding_left_x.value.axis, binding_left_y.value.axis,
left_offset_x, left_offset_y));
} else {
joystick->PreSetAxis(binding_left_x.value.axis);
joystick->PreSetAxis(binding_left_y.value.axis);
const auto left_offset_x = -joystick->GetAxis(binding_left_x.value.axis, 1.0f, 0);
const auto left_offset_y = -joystick->GetAxis(binding_left_y.value.axis, 1.0f, 0);
mapping.insert_or_assign(
Settings::NativeAnalog::LStick,
BuildParamPackageForAnalog(joystick->GetPort(), joystick->GetGUID(),
binding_left_x.value.axis, binding_left_y.value.axis));
binding_left_x.value.axis, binding_left_y.value.axis,
left_offset_x, left_offset_y));
}
const auto& binding_right_x =
SDL_GameControllerGetBindForAxis(controller, SDL_CONTROLLER_AXIS_RIGHTX);
const auto& binding_right_y =
SDL_GameControllerGetBindForAxis(controller, SDL_CONTROLLER_AXIS_RIGHTY);
joystick->PreSetAxis(binding_right_x.value.axis);
joystick->PreSetAxis(binding_right_y.value.axis);
const auto right_offset_x = -joystick->GetAxis(binding_right_x.value.axis, 1.0f, 0);
const auto right_offset_y = -joystick->GetAxis(binding_right_y.value.axis, 1.0f, 0);
mapping.insert_or_assign(Settings::NativeAnalog::RStick,
BuildParamPackageForAnalog(joystick->GetPort(), joystick->GetGUID(),
binding_right_x.value.axis,
binding_right_y.value.axis));
binding_right_y.value.axis, right_offset_x,
right_offset_y));
return mapping;
}
@@ -1563,8 +1601,9 @@ public:
}
if (const auto joystick = state.GetSDLJoystickBySDLID(event.jaxis.which)) {
// Set offset to zero since the joystick is not on center
auto params = BuildParamPackageForAnalog(joystick->GetPort(), joystick->GetGUID(),
first_axis, axis);
first_axis, axis, 0, 0);
first_axis = -1;
return params;
}

View File

@@ -0,0 +1,268 @@
add_library(shader_recompiler STATIC
backend/bindings.h
backend/glasm/emit_context.cpp
backend/glasm/emit_context.h
backend/glasm/emit_glasm.cpp
backend/glasm/emit_glasm.h
backend/glasm/emit_glasm_barriers.cpp
backend/glasm/emit_glasm_bitwise_conversion.cpp
backend/glasm/emit_glasm_composite.cpp
backend/glasm/emit_glasm_context_get_set.cpp
backend/glasm/emit_glasm_control_flow.cpp
backend/glasm/emit_glasm_convert.cpp
backend/glasm/emit_glasm_floating_point.cpp
backend/glasm/emit_glasm_image.cpp
backend/glasm/emit_glasm_instructions.h
backend/glasm/emit_glasm_integer.cpp
backend/glasm/emit_glasm_logical.cpp
backend/glasm/emit_glasm_memory.cpp
backend/glasm/emit_glasm_not_implemented.cpp
backend/glasm/emit_glasm_select.cpp
backend/glasm/emit_glasm_shared_memory.cpp
backend/glasm/emit_glasm_special.cpp
backend/glasm/emit_glasm_undefined.cpp
backend/glasm/emit_glasm_warp.cpp
backend/glasm/reg_alloc.cpp
backend/glasm/reg_alloc.h
backend/glsl/emit_context.cpp
backend/glsl/emit_context.h
backend/glsl/emit_glsl.cpp
backend/glsl/emit_glsl.h
backend/glsl/emit_glsl_atomic.cpp
backend/glsl/emit_glsl_barriers.cpp
backend/glsl/emit_glsl_bitwise_conversion.cpp
backend/glsl/emit_glsl_composite.cpp
backend/glsl/emit_glsl_context_get_set.cpp
backend/glsl/emit_glsl_control_flow.cpp
backend/glsl/emit_glsl_convert.cpp
backend/glsl/emit_glsl_floating_point.cpp
backend/glsl/emit_glsl_image.cpp
backend/glsl/emit_glsl_instructions.h
backend/glsl/emit_glsl_integer.cpp
backend/glsl/emit_glsl_logical.cpp
backend/glsl/emit_glsl_memory.cpp
backend/glsl/emit_glsl_not_implemented.cpp
backend/glsl/emit_glsl_select.cpp
backend/glsl/emit_glsl_shared_memory.cpp
backend/glsl/emit_glsl_special.cpp
backend/glsl/emit_glsl_undefined.cpp
backend/glsl/emit_glsl_warp.cpp
backend/glsl/var_alloc.cpp
backend/glsl/var_alloc.h
backend/spirv/emit_context.cpp
backend/spirv/emit_context.h
backend/spirv/emit_spirv.cpp
backend/spirv/emit_spirv.h
backend/spirv/emit_spirv_atomic.cpp
backend/spirv/emit_spirv_barriers.cpp
backend/spirv/emit_spirv_bitwise_conversion.cpp
backend/spirv/emit_spirv_composite.cpp
backend/spirv/emit_spirv_context_get_set.cpp
backend/spirv/emit_spirv_control_flow.cpp
backend/spirv/emit_spirv_convert.cpp
backend/spirv/emit_spirv_floating_point.cpp
backend/spirv/emit_spirv_image.cpp
backend/spirv/emit_spirv_image_atomic.cpp
backend/spirv/emit_spirv_instructions.h
backend/spirv/emit_spirv_integer.cpp
backend/spirv/emit_spirv_logical.cpp
backend/spirv/emit_spirv_memory.cpp
backend/spirv/emit_spirv_select.cpp
backend/spirv/emit_spirv_shared_memory.cpp
backend/spirv/emit_spirv_special.cpp
backend/spirv/emit_spirv_undefined.cpp
backend/spirv/emit_spirv_warp.cpp
environment.h
exception.h
frontend/ir/abstract_syntax_list.h
frontend/ir/attribute.cpp
frontend/ir/attribute.h
frontend/ir/basic_block.cpp
frontend/ir/basic_block.h
frontend/ir/breadth_first_search.h
frontend/ir/condition.cpp
frontend/ir/condition.h
frontend/ir/flow_test.cpp
frontend/ir/flow_test.h
frontend/ir/ir_emitter.cpp
frontend/ir/ir_emitter.h
frontend/ir/microinstruction.cpp
frontend/ir/modifiers.h
frontend/ir/opcodes.cpp
frontend/ir/opcodes.h
frontend/ir/opcodes.inc
frontend/ir/patch.cpp
frontend/ir/patch.h
frontend/ir/post_order.cpp
frontend/ir/post_order.h
frontend/ir/pred.h
frontend/ir/program.cpp
frontend/ir/program.h
frontend/ir/reg.h
frontend/ir/type.cpp
frontend/ir/type.h
frontend/ir/value.cpp
frontend/ir/value.h
frontend/maxwell/control_flow.cpp
frontend/maxwell/control_flow.h
frontend/maxwell/decode.cpp
frontend/maxwell/decode.h
frontend/maxwell/indirect_branch_table_track.cpp
frontend/maxwell/indirect_branch_table_track.h
frontend/maxwell/instruction.h
frontend/maxwell/location.h
frontend/maxwell/maxwell.inc
frontend/maxwell/opcodes.cpp
frontend/maxwell/opcodes.h
frontend/maxwell/structured_control_flow.cpp
frontend/maxwell/structured_control_flow.h
frontend/maxwell/translate/impl/atomic_operations_global_memory.cpp
frontend/maxwell/translate/impl/atomic_operations_shared_memory.cpp
frontend/maxwell/translate/impl/attribute_memory_to_physical.cpp
frontend/maxwell/translate/impl/barrier_operations.cpp
frontend/maxwell/translate/impl/bitfield_extract.cpp
frontend/maxwell/translate/impl/bitfield_insert.cpp
frontend/maxwell/translate/impl/branch_indirect.cpp
frontend/maxwell/translate/impl/common_encoding.h
frontend/maxwell/translate/impl/common_funcs.cpp
frontend/maxwell/translate/impl/common_funcs.h
frontend/maxwell/translate/impl/condition_code_set.cpp
frontend/maxwell/translate/impl/double_add.cpp
frontend/maxwell/translate/impl/double_compare_and_set.cpp
frontend/maxwell/translate/impl/double_fused_multiply_add.cpp
frontend/maxwell/translate/impl/double_min_max.cpp
frontend/maxwell/translate/impl/double_multiply.cpp
frontend/maxwell/translate/impl/double_set_predicate.cpp
frontend/maxwell/translate/impl/exit_program.cpp
frontend/maxwell/translate/impl/find_leading_one.cpp
frontend/maxwell/translate/impl/floating_point_add.cpp
frontend/maxwell/translate/impl/floating_point_compare.cpp
frontend/maxwell/translate/impl/floating_point_compare_and_set.cpp
frontend/maxwell/translate/impl/floating_point_conversion_floating_point.cpp
frontend/maxwell/translate/impl/floating_point_conversion_integer.cpp
frontend/maxwell/translate/impl/floating_point_fused_multiply_add.cpp
frontend/maxwell/translate/impl/floating_point_min_max.cpp
frontend/maxwell/translate/impl/floating_point_multi_function.cpp
frontend/maxwell/translate/impl/floating_point_multiply.cpp
frontend/maxwell/translate/impl/floating_point_range_reduction.cpp
frontend/maxwell/translate/impl/floating_point_set_predicate.cpp
frontend/maxwell/translate/impl/floating_point_swizzled_add.cpp
frontend/maxwell/translate/impl/half_floating_point_add.cpp
frontend/maxwell/translate/impl/half_floating_point_fused_multiply_add.cpp
frontend/maxwell/translate/impl/half_floating_point_helper.cpp
frontend/maxwell/translate/impl/half_floating_point_helper.h
frontend/maxwell/translate/impl/half_floating_point_multiply.cpp
frontend/maxwell/translate/impl/half_floating_point_set.cpp
frontend/maxwell/translate/impl/half_floating_point_set_predicate.cpp
frontend/maxwell/translate/impl/impl.cpp
frontend/maxwell/translate/impl/impl.h
frontend/maxwell/translate/impl/integer_add.cpp
frontend/maxwell/translate/impl/integer_add_three_input.cpp
frontend/maxwell/translate/impl/integer_compare.cpp
frontend/maxwell/translate/impl/integer_compare_and_set.cpp
frontend/maxwell/translate/impl/integer_floating_point_conversion.cpp
frontend/maxwell/translate/impl/integer_funnel_shift.cpp
frontend/maxwell/translate/impl/integer_minimum_maximum.cpp
frontend/maxwell/translate/impl/integer_popcount.cpp
frontend/maxwell/translate/impl/integer_scaled_add.cpp
frontend/maxwell/translate/impl/integer_set_predicate.cpp
frontend/maxwell/translate/impl/integer_shift_left.cpp
frontend/maxwell/translate/impl/integer_shift_right.cpp
frontend/maxwell/translate/impl/integer_short_multiply_add.cpp
frontend/maxwell/translate/impl/integer_to_integer_conversion.cpp
frontend/maxwell/translate/impl/internal_stage_buffer_entry_read.cpp
frontend/maxwell/translate/impl/load_constant.cpp
frontend/maxwell/translate/impl/load_constant.h
frontend/maxwell/translate/impl/load_effective_address.cpp
frontend/maxwell/translate/impl/load_store_attribute.cpp
frontend/maxwell/translate/impl/load_store_local_shared.cpp
frontend/maxwell/translate/impl/load_store_memory.cpp
frontend/maxwell/translate/impl/logic_operation.cpp
frontend/maxwell/translate/impl/logic_operation_three_input.cpp
frontend/maxwell/translate/impl/move_predicate_to_register.cpp
frontend/maxwell/translate/impl/move_register.cpp
frontend/maxwell/translate/impl/move_register_to_predicate.cpp
frontend/maxwell/translate/impl/move_special_register.cpp
frontend/maxwell/translate/impl/not_implemented.cpp
frontend/maxwell/translate/impl/output_geometry.cpp
frontend/maxwell/translate/impl/pixel_load.cpp
frontend/maxwell/translate/impl/predicate_set_predicate.cpp
frontend/maxwell/translate/impl/predicate_set_register.cpp
frontend/maxwell/translate/impl/select_source_with_predicate.cpp
frontend/maxwell/translate/impl/surface_atomic_operations.cpp
frontend/maxwell/translate/impl/surface_load_store.cpp
frontend/maxwell/translate/impl/texture_fetch.cpp
frontend/maxwell/translate/impl/texture_fetch_swizzled.cpp
frontend/maxwell/translate/impl/texture_gather.cpp
frontend/maxwell/translate/impl/texture_gather_swizzled.cpp
frontend/maxwell/translate/impl/texture_gradient.cpp
frontend/maxwell/translate/impl/texture_load.cpp
frontend/maxwell/translate/impl/texture_load_swizzled.cpp
frontend/maxwell/translate/impl/texture_mipmap_level.cpp
frontend/maxwell/translate/impl/texture_query.cpp
frontend/maxwell/translate/impl/video_helper.cpp
frontend/maxwell/translate/impl/video_helper.h
frontend/maxwell/translate/impl/video_minimum_maximum.cpp
frontend/maxwell/translate/impl/video_multiply_add.cpp
frontend/maxwell/translate/impl/video_set_predicate.cpp
frontend/maxwell/translate/impl/vote.cpp
frontend/maxwell/translate/impl/warp_shuffle.cpp
frontend/maxwell/translate/translate.cpp
frontend/maxwell/translate/translate.h
frontend/maxwell/translate_program.cpp
frontend/maxwell/translate_program.h
host_translate_info.h
ir_opt/collect_shader_info_pass.cpp
ir_opt/constant_propagation_pass.cpp
ir_opt/dead_code_elimination_pass.cpp
ir_opt/dual_vertex_pass.cpp
ir_opt/global_memory_to_storage_buffer_pass.cpp
ir_opt/identity_removal_pass.cpp
ir_opt/lower_fp16_to_fp32.cpp
ir_opt/lower_int64_to_int32.cpp
ir_opt/passes.h
ir_opt/ssa_rewrite_pass.cpp
ir_opt/texture_pass.cpp
ir_opt/verification_pass.cpp
object_pool.h
profile.h
program_header.h
runtime_info.h
shader_info.h
varying_state.h
)
target_link_libraries(shader_recompiler PUBLIC common fmt::fmt sirit)
if (MSVC)
target_compile_options(shader_recompiler PRIVATE
/W4
/WX
/we4018 # 'expression' : signed/unsigned mismatch
/we4244 # 'argument' : conversion from 'type1' to 'type2', possible loss of data (floating-point)
/we4245 # 'conversion' : conversion from 'type1' to 'type2', signed/unsigned mismatch
/we4254 # 'operator': conversion from 'type1:field_bits' to 'type2:field_bits', possible loss of data
/we4267 # 'var' : conversion from 'size_t' to 'type', possible loss of data
/we4305 # 'context' : truncation from 'type1' to 'type2'
/we4800 # Implicit conversion from 'type' to bool. Possible information loss
/we4826 # Conversion from 'type1' to 'type2' is sign-extended. This may cause unexpected runtime behavior.
)
else()
target_compile_options(shader_recompiler PRIVATE
-Werror
-Werror=conversion
-Werror=ignored-qualifiers
-Werror=implicit-fallthrough
-Werror=shadow
-Werror=sign-compare
$<$<CXX_COMPILER_ID:GNU>:-Werror=unused-but-set-parameter>
$<$<CXX_COMPILER_ID:GNU>:-Werror=unused-but-set-variable>
-Werror=unused-variable
# Bracket depth determines maximum size of a fold expression in Clang since 9c9974c3ccb6.
# And this in turns limits the size of a std::array.
$<$<CXX_COMPILER_ID:Clang>:-fbracket-depth=1024>
)
endif()
create_target_directory_groups(shader_recompiler)

View File

@@ -0,0 +1,19 @@
// Copyright 2021 yuzu Emulator Project
// Licensed under GPLv2 or any later version
// Refer to the license.txt file included.
#pragma once
#include "common/common_types.h"
namespace Shader::Backend {
struct Bindings {
u32 unified{};
u32 uniform_buffer{};
u32 storage_buffer{};
u32 texture{};
u32 image{};
};
} // namespace Shader::Backend

View File

@@ -0,0 +1,154 @@
// Copyright 2021 yuzu Emulator Project
// Licensed under GPLv2 or any later version
// Refer to the license.txt file included.
#include <string_view>
#include "shader_recompiler/backend/bindings.h"
#include "shader_recompiler/backend/glasm/emit_context.h"
#include "shader_recompiler/frontend/ir/program.h"
#include "shader_recompiler/profile.h"
#include "shader_recompiler/runtime_info.h"
namespace Shader::Backend::GLASM {
namespace {
std::string_view InterpDecorator(Interpolation interp) {
switch (interp) {
case Interpolation::Smooth:
return "";
case Interpolation::Flat:
return "FLAT ";
case Interpolation::NoPerspective:
return "NOPERSPECTIVE ";
}
throw InvalidArgument("Invalid interpolation {}", interp);
}
bool IsInputArray(Stage stage) {
return stage == Stage::Geometry || stage == Stage::TessellationControl ||
stage == Stage::TessellationEval;
}
} // Anonymous namespace
EmitContext::EmitContext(IR::Program& program, Bindings& bindings, const Profile& profile_,
const RuntimeInfo& runtime_info_)
: info{program.info}, profile{profile_}, runtime_info{runtime_info_} {
// FIXME: Temporary partial implementation
u32 cbuf_index{};
for (const auto& desc : info.constant_buffer_descriptors) {
if (desc.count != 1) {
throw NotImplementedException("Constant buffer descriptor array");
}
Add("CBUFFER c{}[]={{program.buffer[{}]}};", desc.index, cbuf_index);
++cbuf_index;
}
u32 ssbo_index{};
for (const auto& desc : info.storage_buffers_descriptors) {
if (desc.count != 1) {
throw NotImplementedException("Storage buffer descriptor array");
}
if (runtime_info.glasm_use_storage_buffers) {
Add("STORAGE ssbo{}[]={{program.storage[{}]}};", ssbo_index, bindings.storage_buffer);
++bindings.storage_buffer;
++ssbo_index;
}
}
if (!runtime_info.glasm_use_storage_buffers) {
if (const size_t num = info.storage_buffers_descriptors.size(); num > 0) {
Add("PARAM c[{}]={{program.local[0..{}]}};", num, num - 1);
}
}
stage = program.stage;
switch (program.stage) {
case Stage::VertexA:
case Stage::VertexB:
stage_name = "vertex";
attrib_name = "vertex";
break;
case Stage::TessellationControl:
case Stage::TessellationEval:
stage_name = "primitive";
attrib_name = "primitive";
break;
case Stage::Geometry:
stage_name = "primitive";
attrib_name = "vertex";
break;
case Stage::Fragment:
stage_name = "fragment";
attrib_name = "fragment";
break;
case Stage::Compute:
stage_name = "invocation";
break;
}
const std::string_view attr_stage{stage == Stage::Fragment ? "fragment" : "vertex"};
const VaryingState loads{info.loads.mask | info.passthrough.mask};
for (size_t index = 0; index < IR::NUM_GENERICS; ++index) {
if (loads.Generic(index)) {
Add("{}ATTRIB in_attr{}[]={{{}.attrib[{}..{}]}};",
InterpDecorator(info.interpolation[index]), index, attr_stage, index, index);
}
}
if (IsInputArray(stage) && loads.AnyComponent(IR::Attribute::PositionX)) {
Add("ATTRIB vertex_position=vertex.position;");
}
if (info.uses_invocation_id) {
Add("ATTRIB primitive_invocation=primitive.invocation;");
}
if (info.stores_tess_level_outer) {
Add("OUTPUT result_patch_tessouter[]={{result.patch.tessouter[0..3]}};");
}
if (info.stores_tess_level_inner) {
Add("OUTPUT result_patch_tessinner[]={{result.patch.tessinner[0..1]}};");
}
if (info.stores.ClipDistances()) {
Add("OUTPUT result_clip[]={{result.clip[0..7]}};");
}
for (size_t index = 0; index < info.uses_patches.size(); ++index) {
if (!info.uses_patches[index]) {
continue;
}
if (stage == Stage::TessellationControl) {
Add("OUTPUT result_patch_attrib{}[]={{result.patch.attrib[{}..{}]}};"
"ATTRIB primitive_out_patch_attrib{}[]={{primitive.out.patch.attrib[{}..{}]}};",
index, index, index, index, index, index);
} else {
Add("ATTRIB primitive_patch_attrib{}[]={{primitive.patch.attrib[{}..{}]}};", index,
index, index);
}
}
if (stage == Stage::Fragment) {
Add("OUTPUT frag_color0=result.color;");
for (size_t index = 1; index < info.stores_frag_color.size(); ++index) {
Add("OUTPUT frag_color{}=result.color[{}];", index, index);
}
}
for (size_t index = 0; index < IR::NUM_GENERICS; ++index) {
if (info.stores.Generic(index)) {
Add("OUTPUT out_attr{}[]={{result.attrib[{}..{}]}};", index, index, index);
}
}
image_buffer_bindings.reserve(info.image_buffer_descriptors.size());
for (const auto& desc : info.image_buffer_descriptors) {
image_buffer_bindings.push_back(bindings.image);
bindings.image += desc.count;
}
image_bindings.reserve(info.image_descriptors.size());
for (const auto& desc : info.image_descriptors) {
image_bindings.push_back(bindings.image);
bindings.image += desc.count;
}
texture_buffer_bindings.reserve(info.texture_buffer_descriptors.size());
for (const auto& desc : info.texture_buffer_descriptors) {
texture_buffer_bindings.push_back(bindings.texture);
bindings.texture += desc.count;
}
texture_bindings.reserve(info.texture_descriptors.size());
for (const auto& desc : info.texture_descriptors) {
texture_bindings.push_back(bindings.texture);
bindings.texture += desc.count;
}
}
} // namespace Shader::Backend::GLASM

View File

@@ -0,0 +1,80 @@
// Copyright 2021 yuzu Emulator Project
// Licensed under GPLv2 or any later version
// Refer to the license.txt file included.
#pragma once
#include <string>
#include <utility>
#include <vector>
#include <fmt/format.h>
#include "shader_recompiler/backend/glasm/reg_alloc.h"
#include "shader_recompiler/stage.h"
namespace Shader {
struct Info;
struct Profile;
struct RuntimeInfo;
} // namespace Shader
namespace Shader::Backend {
struct Bindings;
}
namespace Shader::IR {
class Inst;
struct Program;
} // namespace Shader::IR
namespace Shader::Backend::GLASM {
class EmitContext {
public:
explicit EmitContext(IR::Program& program, Bindings& bindings, const Profile& profile_,
const RuntimeInfo& runtime_info_);
template <typename... Args>
void Add(const char* format_str, IR::Inst& inst, Args&&... args) {
code += fmt::format(fmt::runtime(format_str), reg_alloc.Define(inst),
std::forward<Args>(args)...);
// TODO: Remove this
code += '\n';
}
template <typename... Args>
void LongAdd(const char* format_str, IR::Inst& inst, Args&&... args) {
code += fmt::format(fmt::runtime(format_str), reg_alloc.LongDefine(inst),
std::forward<Args>(args)...);
// TODO: Remove this
code += '\n';
}
template <typename... Args>
void Add(const char* format_str, Args&&... args) {
code += fmt::format(fmt::runtime(format_str), std::forward<Args>(args)...);
// TODO: Remove this
code += '\n';
}
std::string code;
RegAlloc reg_alloc{};
const Info& info;
const Profile& profile;
const RuntimeInfo& runtime_info;
std::vector<u32> texture_buffer_bindings;
std::vector<u32> image_buffer_bindings;
std::vector<u32> texture_bindings;
std::vector<u32> image_bindings;
Stage stage{};
std::string_view stage_name = "invalid";
std::string_view attrib_name = "invalid";
u32 num_safety_loop_vars{};
bool uses_y_direction{};
};
} // namespace Shader::Backend::GLASM

View File

@@ -0,0 +1,492 @@
// Copyright 2021 yuzu Emulator Project
// Licensed under GPLv2 or any later version
// Refer to the license.txt file included.
#include <algorithm>
#include <string>
#include <tuple>
#include "common/div_ceil.h"
#include "common/settings.h"
#include "shader_recompiler/backend/bindings.h"
#include "shader_recompiler/backend/glasm/emit_context.h"
#include "shader_recompiler/backend/glasm/emit_glasm.h"
#include "shader_recompiler/backend/glasm/emit_glasm_instructions.h"
#include "shader_recompiler/frontend/ir/ir_emitter.h"
#include "shader_recompiler/frontend/ir/program.h"
#include "shader_recompiler/profile.h"
#include "shader_recompiler/runtime_info.h"
namespace Shader::Backend::GLASM {
namespace {
template <class Func>
struct FuncTraits {};
template <class ReturnType_, class... Args>
struct FuncTraits<ReturnType_ (*)(Args...)> {
using ReturnType = ReturnType_;
static constexpr size_t NUM_ARGS = sizeof...(Args);
template <size_t I>
using ArgType = std::tuple_element_t<I, std::tuple<Args...>>;
};
template <typename T>
struct Identity {
Identity(T data_) : data{data_} {}
T Extract() {
return data;
}
T data;
};
template <bool scalar>
class RegWrapper {
public:
RegWrapper(EmitContext& ctx, const IR::Value& ir_value) : reg_alloc{ctx.reg_alloc} {
const Value value{reg_alloc.Peek(ir_value)};
if (value.type == Type::Register) {
inst = ir_value.InstRecursive();
reg = Register{value};
} else {
reg = value.type == Type::U64 ? reg_alloc.AllocLongReg() : reg_alloc.AllocReg();
}
switch (value.type) {
case Type::Register:
case Type::Void:
break;
case Type::U32:
ctx.Add("MOV.U {}.x,{};", reg, value.imm_u32);
break;
case Type::U64:
ctx.Add("MOV.U64 {}.x,{};", reg, value.imm_u64);
break;
}
}
auto Extract() {
if (inst) {
reg_alloc.Unref(*inst);
} else {
reg_alloc.FreeReg(reg);
}
return std::conditional_t<scalar, ScalarRegister, Register>{Value{reg}};
}
private:
RegAlloc& reg_alloc;
IR::Inst* inst{};
Register reg{};
};
template <typename ArgType>
class ValueWrapper {
public:
ValueWrapper(EmitContext& ctx, const IR::Value& ir_value_)
: reg_alloc{ctx.reg_alloc}, ir_value{ir_value_}, value{reg_alloc.Peek(ir_value)} {}
ArgType Extract() {
if (!ir_value.IsImmediate()) {
reg_alloc.Unref(*ir_value.InstRecursive());
}
return value;
}
private:
RegAlloc& reg_alloc;
const IR::Value& ir_value;
ArgType value;
};
template <typename ArgType>
auto Arg(EmitContext& ctx, const IR::Value& arg) {
if constexpr (std::is_same_v<ArgType, Register>) {
return RegWrapper<false>{ctx, arg};
} else if constexpr (std::is_same_v<ArgType, ScalarRegister>) {
return RegWrapper<true>{ctx, arg};
} else if constexpr (std::is_base_of_v<Value, ArgType>) {
return ValueWrapper<ArgType>{ctx, arg};
} else if constexpr (std::is_same_v<ArgType, const IR::Value&>) {
return Identity<const IR::Value&>{arg};
} else if constexpr (std::is_same_v<ArgType, u32>) {
return Identity{arg.U32()};
} else if constexpr (std::is_same_v<ArgType, IR::Attribute>) {
return Identity{arg.Attribute()};
} else if constexpr (std::is_same_v<ArgType, IR::Patch>) {
return Identity{arg.Patch()};
} else if constexpr (std::is_same_v<ArgType, IR::Reg>) {
return Identity{arg.Reg()};
}
}
template <auto func, bool is_first_arg_inst>
struct InvokeCall {
template <typename... Args>
InvokeCall(EmitContext& ctx, IR::Inst* inst, Args&&... args) {
if constexpr (is_first_arg_inst) {
func(ctx, *inst, args.Extract()...);
} else {
func(ctx, args.Extract()...);
}
}
};
template <auto func, bool is_first_arg_inst, size_t... I>
void Invoke(EmitContext& ctx, IR::Inst* inst, std::index_sequence<I...>) {
using Traits = FuncTraits<decltype(func)>;
if constexpr (is_first_arg_inst) {
InvokeCall<func, is_first_arg_inst>{
ctx, inst, Arg<typename Traits::template ArgType<I + 2>>(ctx, inst->Arg(I))...};
} else {
InvokeCall<func, is_first_arg_inst>{
ctx, inst, Arg<typename Traits::template ArgType<I + 1>>(ctx, inst->Arg(I))...};
}
}
template <auto func>
void Invoke(EmitContext& ctx, IR::Inst* inst) {
using Traits = FuncTraits<decltype(func)>;
static_assert(Traits::NUM_ARGS >= 1, "Insufficient arguments");
if constexpr (Traits::NUM_ARGS == 1) {
Invoke<func, false>(ctx, inst, std::make_index_sequence<0>{});
} else {
using FirstArgType = typename Traits::template ArgType<1>;
static constexpr bool is_first_arg_inst = std::is_same_v<FirstArgType, IR::Inst&>;
using Indices = std::make_index_sequence<Traits::NUM_ARGS - (is_first_arg_inst ? 2 : 1)>;
Invoke<func, is_first_arg_inst>(ctx, inst, Indices{});
}
}
void EmitInst(EmitContext& ctx, IR::Inst* inst) {
switch (inst->GetOpcode()) {
#define OPCODE(name, result_type, ...) \
case IR::Opcode::name: \
return Invoke<&Emit##name>(ctx, inst);
#include "shader_recompiler/frontend/ir/opcodes.inc"
#undef OPCODE
}
throw LogicError("Invalid opcode {}", inst->GetOpcode());
}
bool IsReference(IR::Inst& inst) {
return inst.GetOpcode() == IR::Opcode::Reference;
}
void PrecolorInst(IR::Inst& phi) {
// Insert phi moves before references to avoid overwritting other phis
const size_t num_args{phi.NumArgs()};
for (size_t i = 0; i < num_args; ++i) {
IR::Block& phi_block{*phi.PhiBlock(i)};
auto it{std::find_if_not(phi_block.rbegin(), phi_block.rend(), IsReference).base()};
IR::IREmitter ir{phi_block, it};
const IR::Value arg{phi.Arg(i)};
if (arg.IsImmediate()) {
ir.PhiMove(phi, arg);
} else {
ir.PhiMove(phi, IR::Value{&RegAlloc::AliasInst(*arg.Inst())});
}
}
for (size_t i = 0; i < num_args; ++i) {
IR::IREmitter{*phi.PhiBlock(i)}.Reference(IR::Value{&phi});
}
}
void Precolor(const IR::Program& program) {
for (IR::Block* const block : program.blocks) {
for (IR::Inst& phi : block->Instructions()) {
if (!IR::IsPhi(phi)) {
break;
}
PrecolorInst(phi);
}
}
}
void EmitCode(EmitContext& ctx, const IR::Program& program) {
const auto eval{
[&](const IR::U1& cond) { return ScalarS32{ctx.reg_alloc.Consume(IR::Value{cond})}; }};
for (const IR::AbstractSyntaxNode& node : program.syntax_list) {
switch (node.type) {
case IR::AbstractSyntaxNode::Type::Block:
for (IR::Inst& inst : node.data.block->Instructions()) {
EmitInst(ctx, &inst);
}
break;
case IR::AbstractSyntaxNode::Type::If:
ctx.Add("MOV.S.CC RC,{};"
"IF NE.x;",
eval(node.data.if_node.cond));
break;
case IR::AbstractSyntaxNode::Type::EndIf:
ctx.Add("ENDIF;");
break;
case IR::AbstractSyntaxNode::Type::Loop:
ctx.Add("REP;");
break;
case IR::AbstractSyntaxNode::Type::Repeat:
if (!Settings::values.disable_shader_loop_safety_checks) {
const u32 loop_index{ctx.num_safety_loop_vars++};
const u32 vector_index{loop_index / 4};
const char component{"xyzw"[loop_index % 4]};
ctx.Add("SUB.S.CC loop{}.{},loop{}.{},1;"
"BRK(LT.{});",
vector_index, component, vector_index, component, component);
}
if (node.data.repeat.cond.IsImmediate()) {
if (node.data.repeat.cond.U1()) {
ctx.Add("ENDREP;");
} else {
ctx.Add("BRK;"
"ENDREP;");
}
} else {
ctx.Add("MOV.S.CC RC,{};"
"BRK(EQ.x);"
"ENDREP;",
eval(node.data.repeat.cond));
}
break;
case IR::AbstractSyntaxNode::Type::Break:
if (node.data.break_node.cond.IsImmediate()) {
if (node.data.break_node.cond.U1()) {
ctx.Add("BRK;");
}
} else {
ctx.Add("MOV.S.CC RC,{};"
"BRK (NE.x);",
eval(node.data.break_node.cond));
}
break;
case IR::AbstractSyntaxNode::Type::Return:
case IR::AbstractSyntaxNode::Type::Unreachable:
ctx.Add("RET;");
break;
}
}
if (!ctx.reg_alloc.IsEmpty()) {
LOG_WARNING(Shader_GLASM, "Register leak after generating code");
}
}
void SetupOptions(const IR::Program& program, const Profile& profile,
const RuntimeInfo& runtime_info, std::string& header) {
const Info& info{program.info};
const Stage stage{program.stage};
// TODO: Track the shared atomic ops
header += "OPTION NV_internal;"
"OPTION NV_shader_storage_buffer;"
"OPTION NV_gpu_program_fp64;";
if (info.uses_int64_bit_atomics) {
header += "OPTION NV_shader_atomic_int64;";
}
if (info.uses_atomic_f32_add) {
header += "OPTION NV_shader_atomic_float;";
}
if (info.uses_atomic_f16x2_add || info.uses_atomic_f16x2_min || info.uses_atomic_f16x2_max) {
header += "OPTION NV_shader_atomic_fp16_vector;";
}
if (info.uses_subgroup_invocation_id || info.uses_subgroup_mask || info.uses_subgroup_vote ||
info.uses_fswzadd) {
header += "OPTION NV_shader_thread_group;";
}
if (info.uses_subgroup_shuffles) {
header += "OPTION NV_shader_thread_shuffle;";
}
if (info.uses_sparse_residency) {
header += "OPTION EXT_sparse_texture2;";
}
const bool stores_viewport_layer{info.stores[IR::Attribute::ViewportIndex] ||
info.stores[IR::Attribute::Layer]};
if ((stage != Stage::Geometry && stores_viewport_layer) ||
info.stores[IR::Attribute::ViewportMask]) {
if (profile.support_viewport_index_layer_non_geometry) {
header += "OPTION NV_viewport_array2;";
}
}
if (program.is_geometry_passthrough && profile.support_geometry_shader_passthrough) {
header += "OPTION NV_geometry_shader_passthrough;";
}
if (info.uses_typeless_image_reads && profile.support_typeless_image_loads) {
header += "OPTION EXT_shader_image_load_formatted;";
}
if (profile.support_derivative_control) {
header += "OPTION ARB_derivative_control;";
}
if (stage == Stage::Fragment && runtime_info.force_early_z != 0) {
header += "OPTION NV_early_fragment_tests;";
}
if (stage == Stage::Fragment) {
header += "OPTION ARB_draw_buffers;";
}
}
std::string_view StageHeader(Stage stage) {
switch (stage) {
case Stage::VertexA:
case Stage::VertexB:
return "!!NVvp5.0\n";
case Stage::TessellationControl:
return "!!NVtcp5.0\n";
case Stage::TessellationEval:
return "!!NVtep5.0\n";
case Stage::Geometry:
return "!!NVgp5.0\n";
case Stage::Fragment:
return "!!NVfp5.0\n";
case Stage::Compute:
return "!!NVcp5.0\n";
}
throw InvalidArgument("Invalid stage {}", stage);
}
std::string_view InputPrimitive(InputTopology topology) {
switch (topology) {
case InputTopology::Points:
return "POINTS";
case InputTopology::Lines:
return "LINES";
case InputTopology::LinesAdjacency:
return "LINES_ADJACENCY";
case InputTopology::Triangles:
return "TRIANGLES";
case InputTopology::TrianglesAdjacency:
return "TRIANGLES_ADJACENCY";
}
throw InvalidArgument("Invalid input topology {}", topology);
}
std::string_view OutputPrimitive(OutputTopology topology) {
switch (topology) {
case OutputTopology::PointList:
return "POINTS";
case OutputTopology::LineStrip:
return "LINE_STRIP";
case OutputTopology::TriangleStrip:
return "TRIANGLE_STRIP";
}
throw InvalidArgument("Invalid output topology {}", topology);
}
std::string_view GetTessMode(TessPrimitive primitive) {
switch (primitive) {
case TessPrimitive::Triangles:
return "TRIANGLES";
case TessPrimitive::Quads:
return "QUADS";
case TessPrimitive::Isolines:
return "ISOLINES";
}
throw InvalidArgument("Invalid tessellation primitive {}", primitive);
}
std::string_view GetTessSpacing(TessSpacing spacing) {
switch (spacing) {
case TessSpacing::Equal:
return "EQUAL";
case TessSpacing::FractionalOdd:
return "FRACTIONAL_ODD";
case TessSpacing::FractionalEven:
return "FRACTIONAL_EVEN";
}
throw InvalidArgument("Invalid tessellation spacing {}", spacing);
}
} // Anonymous namespace
std::string EmitGLASM(const Profile& profile, const RuntimeInfo& runtime_info, IR::Program& program,
Bindings& bindings) {
EmitContext ctx{program, bindings, profile, runtime_info};
Precolor(program);
EmitCode(ctx, program);
std::string header{StageHeader(program.stage)};
SetupOptions(program, profile, runtime_info, header);
switch (program.stage) {
case Stage::TessellationControl:
header += fmt::format("VERTICES_OUT {};", program.invocations);
break;
case Stage::TessellationEval:
header += fmt::format("TESS_MODE {};"
"TESS_SPACING {};"
"TESS_VERTEX_ORDER {};",
GetTessMode(runtime_info.tess_primitive),
GetTessSpacing(runtime_info.tess_spacing),
runtime_info.tess_clockwise ? "CW" : "CCW");
break;
case Stage::Geometry:
header += fmt::format("PRIMITIVE_IN {};", InputPrimitive(runtime_info.input_topology));
if (program.is_geometry_passthrough) {
if (profile.support_geometry_shader_passthrough) {
for (size_t index = 0; index < IR::NUM_GENERICS; ++index) {
if (program.info.passthrough.Generic(index)) {
header += fmt::format("PASSTHROUGH result.attrib[{}];", index);
}
}
if (program.info.passthrough.AnyComponent(IR::Attribute::PositionX)) {
header += "PASSTHROUGH result.position;";
}
} else {
LOG_WARNING(Shader_GLASM, "Passthrough geometry program used but not supported");
}
} else {
header +=
fmt::format("VERTICES_OUT {};"
"PRIMITIVE_OUT {};",
program.output_vertices, OutputPrimitive(program.output_topology));
}
break;
case Stage::Compute:
header += fmt::format("GROUP_SIZE {} {} {};", program.workgroup_size[0],
program.workgroup_size[1], program.workgroup_size[2]);
break;
default:
break;
}
if (program.shared_memory_size > 0) {
header += fmt::format("SHARED_MEMORY {};", program.shared_memory_size);
header += fmt::format("SHARED shared_mem[]={{program.sharedmem}};");
}
header += "TEMP ";
for (size_t index = 0; index < ctx.reg_alloc.NumUsedRegisters(); ++index) {
header += fmt::format("R{},", index);
}
if (program.local_memory_size > 0) {
header += fmt::format("lmem[{}],", program.local_memory_size);
}
if (program.info.uses_fswzadd) {
header += "FSWZA[4],FSWZB[4],";
}
const u32 num_safety_loop_vectors{Common::DivCeil(ctx.num_safety_loop_vars, 4u)};
for (u32 index = 0; index < num_safety_loop_vectors; ++index) {
header += fmt::format("loop{},", index);
}
header += "RC;"
"LONG TEMP ";
for (size_t index = 0; index < ctx.reg_alloc.NumUsedLongRegisters(); ++index) {
header += fmt::format("D{},", index);
}
header += "DC;";
if (program.info.uses_fswzadd) {
header += "MOV.F FSWZA[0],-1;"
"MOV.F FSWZA[1],1;"
"MOV.F FSWZA[2],-1;"
"MOV.F FSWZA[3],0;"
"MOV.F FSWZB[0],-1;"
"MOV.F FSWZB[1],-1;"
"MOV.F FSWZB[2],1;"
"MOV.F FSWZB[3],-1;";
}
for (u32 index = 0; index < num_safety_loop_vectors; ++index) {
header += fmt::format("MOV.S loop{},{{0x2000,0x2000,0x2000,0x2000}};", index);
}
if (ctx.uses_y_direction) {
header += "PARAM y_direction[1]={state.material.front.ambient};";
}
ctx.code.insert(0, header);
ctx.code += "END";
return ctx.code;
}
} // namespace Shader::Backend::GLASM

View File

@@ -0,0 +1,25 @@
// Copyright 2021 yuzu Emulator Project
// Licensed under GPLv2 or any later version
// Refer to the license.txt file included.
#pragma once
#include <string>
#include "shader_recompiler/backend/bindings.h"
#include "shader_recompiler/frontend/ir/program.h"
#include "shader_recompiler/profile.h"
#include "shader_recompiler/runtime_info.h"
namespace Shader::Backend::GLASM {
[[nodiscard]] std::string EmitGLASM(const Profile& profile, const RuntimeInfo& runtime_info,
IR::Program& program, Bindings& bindings);
[[nodiscard]] inline std::string EmitGLASM(const Profile& profile, const RuntimeInfo& runtime_info,
IR::Program& program) {
Bindings binding;
return EmitGLASM(profile, runtime_info, program, binding);
}
} // namespace Shader::Backend::GLASM

View File

@@ -0,0 +1,91 @@
// Copyright 2021 yuzu Emulator Project
// Licensed under GPLv2 or any later version
// Refer to the license.txt file included.
#include "shader_recompiler/backend/glasm/emit_context.h"
#include "shader_recompiler/backend/glasm/emit_glasm_instructions.h"
#include "shader_recompiler/frontend/ir/value.h"
namespace Shader::Backend::GLASM {
static void Alias(IR::Inst& inst, const IR::Value& value) {
if (value.IsImmediate()) {
return;
}
IR::Inst& value_inst{RegAlloc::AliasInst(*value.Inst())};
value_inst.DestructiveAddUsage(inst.UseCount());
value_inst.DestructiveRemoveUsage();
inst.SetDefinition(value_inst.Definition<Id>());
}
void EmitIdentity(EmitContext&, IR::Inst& inst, const IR::Value& value) {
Alias(inst, value);
}
void EmitConditionRef(EmitContext& ctx, IR::Inst& inst, const IR::Value& value) {
// Fake one usage to get a real register out of the condition
inst.DestructiveAddUsage(1);
const Register ret{ctx.reg_alloc.Define(inst)};
const ScalarS32 input{ctx.reg_alloc.Consume(value)};
if (ret != input) {
ctx.Add("MOV.S {},{};", ret, input);
}
}
void EmitBitCastU16F16(EmitContext&, IR::Inst& inst, const IR::Value& value) {
Alias(inst, value);
}
void EmitBitCastU32F32(EmitContext&, IR::Inst& inst, const IR::Value& value) {
Alias(inst, value);
}
void EmitBitCastU64F64(EmitContext&, IR::Inst& inst, const IR::Value& value) {
Alias(inst, value);
}
void EmitBitCastF16U16(EmitContext&, IR::Inst& inst, const IR::Value& value) {
Alias(inst, value);
}
void EmitBitCastF32U32(EmitContext&, IR::Inst& inst, const IR::Value& value) {
Alias(inst, value);
}
void EmitBitCastF64U64(EmitContext&, IR::Inst& inst, const IR::Value& value) {
Alias(inst, value);
}
void EmitPackUint2x32(EmitContext& ctx, IR::Inst& inst, Register value) {
ctx.LongAdd("PK64.U {}.x,{};", inst, value);
}
void EmitUnpackUint2x32(EmitContext& ctx, IR::Inst& inst, Register value) {
ctx.Add("UP64.U {}.xy,{}.x;", inst, value);
}
void EmitPackFloat2x16([[maybe_unused]] EmitContext& ctx, [[maybe_unused]] Register value) {
throw NotImplementedException("GLASM instruction");
}
void EmitUnpackFloat2x16([[maybe_unused]] EmitContext& ctx, [[maybe_unused]] Register value) {
throw NotImplementedException("GLASM instruction");
}
void EmitPackHalf2x16(EmitContext& ctx, IR::Inst& inst, Register value) {
ctx.Add("PK2H {}.x,{};", inst, value);
}
void EmitUnpackHalf2x16(EmitContext& ctx, IR::Inst& inst, Register value) {
ctx.Add("UP2H {}.xy,{}.x;", inst, value);
}
void EmitPackDouble2x32(EmitContext& ctx, IR::Inst& inst, Register value) {
ctx.LongAdd("PK64 {}.x,{};", inst, value);
}
void EmitUnpackDouble2x32(EmitContext& ctx, IR::Inst& inst, Register value) {
ctx.Add("UP64 {}.xy,{}.x;", inst, value);
}
} // namespace Shader::Backend::GLASM

View File

@@ -0,0 +1,244 @@
// Copyright 2021 yuzu Emulator Project
// Licensed under GPLv2 or any later version
// Refer to the license.txt file included.
#include "shader_recompiler/backend/glasm/emit_context.h"
#include "shader_recompiler/backend/glasm/emit_glasm_instructions.h"
#include "shader_recompiler/frontend/ir/value.h"
namespace Shader::Backend::GLASM {
namespace {
template <auto read_imm, char type, typename... Values>
void CompositeConstruct(EmitContext& ctx, IR::Inst& inst, Values&&... elements) {
const Register ret{ctx.reg_alloc.Define(inst)};
if (std::ranges::any_of(std::array{elements...},
[](const IR::Value& value) { return value.IsImmediate(); })) {
using Type = std::invoke_result_t<decltype(read_imm), IR::Value>;
const std::array<Type, 4> values{(elements.IsImmediate() ? (elements.*read_imm)() : 0)...};
ctx.Add("MOV.{} {},{{{},{},{},{}}};", type, ret, fmt::to_string(values[0]),
fmt::to_string(values[1]), fmt::to_string(values[2]), fmt::to_string(values[3]));
}
size_t index{};
for (const IR::Value& element : {elements...}) {
if (!element.IsImmediate()) {
const ScalarU32 value{ctx.reg_alloc.Consume(element)};
ctx.Add("MOV.{} {}.{},{};", type, ret, "xyzw"[index], value);
}
++index;
}
}
void CompositeExtract(EmitContext& ctx, IR::Inst& inst, Register composite, u32 index, char type) {
const Register ret{ctx.reg_alloc.Define(inst)};
if (ret == composite && index == 0) {
// No need to do anything here, the source and destination are the same register
return;
}
ctx.Add("MOV.{} {}.x,{}.{};", type, ret, composite, "xyzw"[index]);
}
template <typename ObjectType>
void CompositeInsert(EmitContext& ctx, IR::Inst& inst, Register composite, ObjectType object,
u32 index, char type) {
const Register ret{ctx.reg_alloc.Define(inst)};
const char swizzle{"xyzw"[index]};
if (ret != composite && ret == object) {
// The object is aliased with the return value, so we have to use a temporary to insert
ctx.Add("MOV.{} RC,{};"
"MOV.{} RC.{},{};"
"MOV.{} {},RC;",
type, composite, type, swizzle, object, type, ret);
} else if (ret != composite) {
// The input composite is not aliased with the return value so we have to copy it before
// hand. But the insert object is not aliased with the return value, so we don't have to
// worry about that
ctx.Add("MOV.{} {},{};"
"MOV.{} {}.{},{};",
type, ret, composite, type, ret, swizzle, object);
} else {
// The return value is alised so we can just insert the object, it doesn't matter if it's
// aliased
ctx.Add("MOV.{} {}.{},{};", type, ret, swizzle, object);
}
}
} // Anonymous namespace
void EmitCompositeConstructU32x2(EmitContext& ctx, IR::Inst& inst, const IR::Value& e1,
const IR::Value& e2) {
CompositeConstruct<&IR::Value::U32, 'U'>(ctx, inst, e1, e2);
}
void EmitCompositeConstructU32x3(EmitContext& ctx, IR::Inst& inst, const IR::Value& e1,
const IR::Value& e2, const IR::Value& e3) {
CompositeConstruct<&IR::Value::U32, 'U'>(ctx, inst, e1, e2, e3);
}
void EmitCompositeConstructU32x4(EmitContext& ctx, IR::Inst& inst, const IR::Value& e1,
const IR::Value& e2, const IR::Value& e3, const IR::Value& e4) {
CompositeConstruct<&IR::Value::U32, 'U'>(ctx, inst, e1, e2, e3, e4);
}
void EmitCompositeExtractU32x2(EmitContext& ctx, IR::Inst& inst, Register composite, u32 index) {
CompositeExtract(ctx, inst, composite, index, 'U');
}
void EmitCompositeExtractU32x3(EmitContext& ctx, IR::Inst& inst, Register composite, u32 index) {
CompositeExtract(ctx, inst, composite, index, 'U');
}
void EmitCompositeExtractU32x4(EmitContext& ctx, IR::Inst& inst, Register composite, u32 index) {
CompositeExtract(ctx, inst, composite, index, 'U');
}
void EmitCompositeInsertU32x2([[maybe_unused]] EmitContext& ctx,
[[maybe_unused]] Register composite,
[[maybe_unused]] ScalarU32 object, [[maybe_unused]] u32 index) {
throw NotImplementedException("GLASM instruction");
}
void EmitCompositeInsertU32x3([[maybe_unused]] EmitContext& ctx,
[[maybe_unused]] Register composite,
[[maybe_unused]] ScalarU32 object, [[maybe_unused]] u32 index) {
throw NotImplementedException("GLASM instruction");
}
void EmitCompositeInsertU32x4([[maybe_unused]] EmitContext& ctx,
[[maybe_unused]] Register composite,
[[maybe_unused]] ScalarU32 object, [[maybe_unused]] u32 index) {
throw NotImplementedException("GLASM instruction");
}
void EmitCompositeConstructF16x2([[maybe_unused]] EmitContext& ctx, [[maybe_unused]] Register e1,
[[maybe_unused]] Register e2) {
throw NotImplementedException("GLASM instruction");
}
void EmitCompositeConstructF16x3([[maybe_unused]] EmitContext& ctx, [[maybe_unused]] Register e1,
[[maybe_unused]] Register e2, [[maybe_unused]] Register e3) {
throw NotImplementedException("GLASM instruction");
}
void EmitCompositeConstructF16x4([[maybe_unused]] EmitContext& ctx, [[maybe_unused]] Register e1,
[[maybe_unused]] Register e2, [[maybe_unused]] Register e3,
[[maybe_unused]] Register e4) {
throw NotImplementedException("GLASM instruction");
}
void EmitCompositeExtractF16x2([[maybe_unused]] EmitContext& ctx,
[[maybe_unused]] Register composite, [[maybe_unused]] u32 index) {
throw NotImplementedException("GLASM instruction");
}
void EmitCompositeExtractF16x3([[maybe_unused]] EmitContext& ctx,
[[maybe_unused]] Register composite, [[maybe_unused]] u32 index) {
throw NotImplementedException("GLASM instruction");
}
void EmitCompositeExtractF16x4([[maybe_unused]] EmitContext& ctx,
[[maybe_unused]] Register composite, [[maybe_unused]] u32 index) {
throw NotImplementedException("GLASM instruction");
}
void EmitCompositeInsertF16x2([[maybe_unused]] EmitContext& ctx,
[[maybe_unused]] Register composite, [[maybe_unused]] Register object,
[[maybe_unused]] u32 index) {
throw NotImplementedException("GLASM instruction");
}
void EmitCompositeInsertF16x3([[maybe_unused]] EmitContext& ctx,
[[maybe_unused]] Register composite, [[maybe_unused]] Register object,
[[maybe_unused]] u32 index) {
throw NotImplementedException("GLASM instruction");
}
void EmitCompositeInsertF16x4([[maybe_unused]] EmitContext& ctx,
[[maybe_unused]] Register composite, [[maybe_unused]] Register object,
[[maybe_unused]] u32 index) {
throw NotImplementedException("GLASM instruction");
}
void EmitCompositeConstructF32x2(EmitContext& ctx, IR::Inst& inst, const IR::Value& e1,
const IR::Value& e2) {
CompositeConstruct<&IR::Value::F32, 'F'>(ctx, inst, e1, e2);
}
void EmitCompositeConstructF32x3(EmitContext& ctx, IR::Inst& inst, const IR::Value& e1,
const IR::Value& e2, const IR::Value& e3) {
CompositeConstruct<&IR::Value::F32, 'F'>(ctx, inst, e1, e2, e3);
}
void EmitCompositeConstructF32x4(EmitContext& ctx, IR::Inst& inst, const IR::Value& e1,
const IR::Value& e2, const IR::Value& e3, const IR::Value& e4) {
CompositeConstruct<&IR::Value::F32, 'F'>(ctx, inst, e1, e2, e3, e4);
}
void EmitCompositeExtractF32x2(EmitContext& ctx, IR::Inst& inst, Register composite, u32 index) {
CompositeExtract(ctx, inst, composite, index, 'F');
}
void EmitCompositeExtractF32x3(EmitContext& ctx, IR::Inst& inst, Register composite, u32 index) {
CompositeExtract(ctx, inst, composite, index, 'F');
}
void EmitCompositeExtractF32x4(EmitContext& ctx, IR::Inst& inst, Register composite, u32 index) {
CompositeExtract(ctx, inst, composite, index, 'F');
}
void EmitCompositeInsertF32x2(EmitContext& ctx, IR::Inst& inst, Register composite,
ScalarF32 object, u32 index) {
CompositeInsert(ctx, inst, composite, object, index, 'F');
}
void EmitCompositeInsertF32x3(EmitContext& ctx, IR::Inst& inst, Register composite,
ScalarF32 object, u32 index) {
CompositeInsert(ctx, inst, composite, object, index, 'F');
}
void EmitCompositeInsertF32x4(EmitContext& ctx, IR::Inst& inst, Register composite,
ScalarF32 object, u32 index) {
CompositeInsert(ctx, inst, composite, object, index, 'F');
}
void EmitCompositeConstructF64x2([[maybe_unused]] EmitContext& ctx) {
throw NotImplementedException("GLASM instruction");
}
void EmitCompositeConstructF64x3([[maybe_unused]] EmitContext& ctx) {
throw NotImplementedException("GLASM instruction");
}
void EmitCompositeConstructF64x4([[maybe_unused]] EmitContext& ctx) {
throw NotImplementedException("GLASM instruction");
}
void EmitCompositeExtractF64x2([[maybe_unused]] EmitContext& ctx) {
throw NotImplementedException("GLASM instruction");
}
void EmitCompositeExtractF64x3([[maybe_unused]] EmitContext& ctx) {
throw NotImplementedException("GLASM instruction");
}
void EmitCompositeExtractF64x4([[maybe_unused]] EmitContext& ctx) {
throw NotImplementedException("GLASM instruction");
}
void EmitCompositeInsertF64x2([[maybe_unused]] EmitContext& ctx,
[[maybe_unused]] Register composite, [[maybe_unused]] Register object,
[[maybe_unused]] u32 index) {
throw NotImplementedException("GLASM instruction");
}
void EmitCompositeInsertF64x3([[maybe_unused]] EmitContext& ctx,
[[maybe_unused]] Register composite, [[maybe_unused]] Register object,
[[maybe_unused]] u32 index) {
throw NotImplementedException("GLASM instruction");
}
void EmitCompositeInsertF64x4([[maybe_unused]] EmitContext& ctx,
[[maybe_unused]] Register composite, [[maybe_unused]] Register object,
[[maybe_unused]] u32 index) {
throw NotImplementedException("GLASM instruction");
}
} // namespace Shader::Backend::GLASM

View File

@@ -0,0 +1,346 @@
// Copyright 2021 yuzu Emulator Project
// Licensed under GPLv2 or any later version
// Refer to the license.txt file included.
#include <string_view>
#include "shader_recompiler/backend/glasm/emit_context.h"
#include "shader_recompiler/backend/glasm/emit_glasm_instructions.h"
#include "shader_recompiler/frontend/ir/value.h"
#include "shader_recompiler/profile.h"
#include "shader_recompiler/shader_info.h"
namespace Shader::Backend::GLASM {
namespace {
void GetCbuf(EmitContext& ctx, IR::Inst& inst, const IR::Value& binding, ScalarU32 offset,
std::string_view size) {
if (!binding.IsImmediate()) {
throw NotImplementedException("Indirect constant buffer loading");
}
const Register ret{ctx.reg_alloc.Define(inst)};
if (offset.type == Type::U32) {
// Avoid reading arrays out of bounds, matching hardware's behavior
if (offset.imm_u32 >= 0x10'000) {
ctx.Add("MOV.S {},0;", ret);
return;
}
}
ctx.Add("LDC.{} {},c{}[{}];", size, ret, binding.U32(), offset);
}
bool IsInputArray(Stage stage) {
return stage == Stage::Geometry || stage == Stage::TessellationControl ||
stage == Stage::TessellationEval;
}
std::string VertexIndex(EmitContext& ctx, ScalarU32 vertex) {
return IsInputArray(ctx.stage) ? fmt::format("[{}]", vertex) : "";
}
u32 TexCoordIndex(IR::Attribute attr) {
return (static_cast<u32>(attr) - static_cast<u32>(IR::Attribute::FixedFncTexture0S)) / 4;
}
} // Anonymous namespace
void EmitGetCbufU8(EmitContext& ctx, IR::Inst& inst, const IR::Value& binding, ScalarU32 offset) {
GetCbuf(ctx, inst, binding, offset, "U8");
}
void EmitGetCbufS8(EmitContext& ctx, IR::Inst& inst, const IR::Value& binding, ScalarU32 offset) {
GetCbuf(ctx, inst, binding, offset, "S8");
}
void EmitGetCbufU16(EmitContext& ctx, IR::Inst& inst, const IR::Value& binding, ScalarU32 offset) {
GetCbuf(ctx, inst, binding, offset, "U16");
}
void EmitGetCbufS16(EmitContext& ctx, IR::Inst& inst, const IR::Value& binding, ScalarU32 offset) {
GetCbuf(ctx, inst, binding, offset, "S16");
}
void EmitGetCbufU32(EmitContext& ctx, IR::Inst& inst, const IR::Value& binding, ScalarU32 offset) {
GetCbuf(ctx, inst, binding, offset, "U32");
}
void EmitGetCbufF32(EmitContext& ctx, IR::Inst& inst, const IR::Value& binding, ScalarU32 offset) {
GetCbuf(ctx, inst, binding, offset, "F32");
}
void EmitGetCbufU32x2(EmitContext& ctx, IR::Inst& inst, const IR::Value& binding,
ScalarU32 offset) {
GetCbuf(ctx, inst, binding, offset, "U32X2");
}
void EmitGetAttribute(EmitContext& ctx, IR::Inst& inst, IR::Attribute attr, ScalarU32 vertex) {
const u32 element{static_cast<u32>(attr) % 4};
const char swizzle{"xyzw"[element]};
if (IR::IsGeneric(attr)) {
const u32 index{IR::GenericAttributeIndex(attr)};
ctx.Add("MOV.F {}.x,in_attr{}{}[0].{};", inst, index, VertexIndex(ctx, vertex), swizzle);
return;
}
if (attr >= IR::Attribute::FixedFncTexture0S && attr <= IR::Attribute::FixedFncTexture9Q) {
const u32 index{TexCoordIndex(attr)};
ctx.Add("MOV.F {}.x,{}.texcoord[{}].{};", inst, ctx.attrib_name, index, swizzle);
return;
}
switch (attr) {
case IR::Attribute::PrimitiveId:
ctx.Add("MOV.S {}.x,primitive.id;", inst);
break;
case IR::Attribute::PositionX:
case IR::Attribute::PositionY:
case IR::Attribute::PositionZ:
case IR::Attribute::PositionW:
if (IsInputArray(ctx.stage)) {
ctx.Add("MOV.F {}.x,vertex_position{}.{};", inst, VertexIndex(ctx, vertex), swizzle);
} else {
ctx.Add("MOV.F {}.x,{}.position.{};", inst, ctx.attrib_name, swizzle);
}
break;
case IR::Attribute::ColorFrontDiffuseR:
case IR::Attribute::ColorFrontDiffuseG:
case IR::Attribute::ColorFrontDiffuseB:
case IR::Attribute::ColorFrontDiffuseA:
ctx.Add("MOV.F {}.x,{}.color.{};", inst, ctx.attrib_name, swizzle);
break;
case IR::Attribute::PointSpriteS:
case IR::Attribute::PointSpriteT:
ctx.Add("MOV.F {}.x,{}.pointcoord.{};", inst, ctx.attrib_name, swizzle);
break;
case IR::Attribute::TessellationEvaluationPointU:
case IR::Attribute::TessellationEvaluationPointV:
ctx.Add("MOV.F {}.x,vertex.tesscoord.{};", inst, swizzle);
break;
case IR::Attribute::InstanceId:
ctx.Add("MOV.S {}.x,{}.instance;", inst, ctx.attrib_name);
break;
case IR::Attribute::VertexId:
ctx.Add("MOV.S {}.x,{}.id;", inst, ctx.attrib_name);
break;
case IR::Attribute::FrontFace:
ctx.Add("CMP.S {}.x,{}.facing.x,0,-1;", inst, ctx.attrib_name);
break;
default:
throw NotImplementedException("Get attribute {}", attr);
}
}
void EmitSetAttribute(EmitContext& ctx, IR::Attribute attr, ScalarF32 value,
[[maybe_unused]] ScalarU32 vertex) {
const u32 element{static_cast<u32>(attr) % 4};
const char swizzle{"xyzw"[element]};
if (IR::IsGeneric(attr)) {
const u32 index{IR::GenericAttributeIndex(attr)};
ctx.Add("MOV.F out_attr{}[0].{},{};", index, swizzle, value);
return;
}
if (attr >= IR::Attribute::FixedFncTexture0S && attr <= IR::Attribute::FixedFncTexture9R) {
const u32 index{TexCoordIndex(attr)};
ctx.Add("MOV.F result.texcoord[{}].{},{};", index, swizzle, value);
return;
}
switch (attr) {
case IR::Attribute::Layer:
if (ctx.stage == Stage::Geometry || ctx.profile.support_viewport_index_layer_non_geometry) {
ctx.Add("MOV.F result.layer.x,{};", value);
} else {
LOG_WARNING(Shader_GLASM,
"Layer stored outside of geometry shader not supported by device");
}
break;
case IR::Attribute::ViewportIndex:
if (ctx.stage == Stage::Geometry || ctx.profile.support_viewport_index_layer_non_geometry) {
ctx.Add("MOV.F result.viewport.x,{};", value);
} else {
LOG_WARNING(Shader_GLASM,
"Viewport stored outside of geometry shader not supported by device");
}
break;
case IR::Attribute::ViewportMask:
// NV_viewport_array2 is required to access result.viewportmask, regardless of shader stage.
if (ctx.profile.support_viewport_index_layer_non_geometry) {
ctx.Add("MOV.F result.viewportmask[0].x,{};", value);
} else {
LOG_WARNING(Shader_GLASM, "Device does not support storing to ViewportMask");
}
break;
case IR::Attribute::PointSize:
ctx.Add("MOV.F result.pointsize.x,{};", value);
break;
case IR::Attribute::PositionX:
case IR::Attribute::PositionY:
case IR::Attribute::PositionZ:
case IR::Attribute::PositionW:
ctx.Add("MOV.F result.position.{},{};", swizzle, value);
break;
case IR::Attribute::ColorFrontDiffuseR:
case IR::Attribute::ColorFrontDiffuseG:
case IR::Attribute::ColorFrontDiffuseB:
case IR::Attribute::ColorFrontDiffuseA:
ctx.Add("MOV.F result.color.{},{};", swizzle, value);
break;
case IR::Attribute::ColorFrontSpecularR:
case IR::Attribute::ColorFrontSpecularG:
case IR::Attribute::ColorFrontSpecularB:
case IR::Attribute::ColorFrontSpecularA:
ctx.Add("MOV.F result.color.secondary.{},{};", swizzle, value);
break;
case IR::Attribute::ColorBackDiffuseR:
case IR::Attribute::ColorBackDiffuseG:
case IR::Attribute::ColorBackDiffuseB:
case IR::Attribute::ColorBackDiffuseA:
ctx.Add("MOV.F result.color.back.{},{};", swizzle, value);
break;
case IR::Attribute::ColorBackSpecularR:
case IR::Attribute::ColorBackSpecularG:
case IR::Attribute::ColorBackSpecularB:
case IR::Attribute::ColorBackSpecularA:
ctx.Add("MOV.F result.color.back.secondary.{},{};", swizzle, value);
break;
case IR::Attribute::FogCoordinate:
ctx.Add("MOV.F result.fogcoord.x,{};", value);
break;
case IR::Attribute::ClipDistance0:
case IR::Attribute::ClipDistance1:
case IR::Attribute::ClipDistance2:
case IR::Attribute::ClipDistance3:
case IR::Attribute::ClipDistance4:
case IR::Attribute::ClipDistance5:
case IR::Attribute::ClipDistance6:
case IR::Attribute::ClipDistance7: {
const u32 index{static_cast<u32>(attr) - static_cast<u32>(IR::Attribute::ClipDistance0)};
ctx.Add("MOV.F result.clip[{}].x,{};", index, value);
break;
}
default:
throw NotImplementedException("Set attribute {}", attr);
}
}
void EmitGetAttributeIndexed(EmitContext& ctx, IR::Inst& inst, ScalarS32 offset, ScalarU32 vertex) {
// RC.x = base_index
// RC.y = masked_index
// RC.z = compare_index
ctx.Add("SHR.S RC.x,{},2;"
"AND.S RC.y,RC.x,3;"
"SHR.S RC.z,{},4;",
offset, offset);
const Register ret{ctx.reg_alloc.Define(inst)};
u32 num_endifs{};
const auto read{[&](u32 compare_index, const std::array<std::string, 4>& values) {
++num_endifs;
ctx.Add("SEQ.S.CC RC.w,RC.z,{};" // compare_index
"IF NE.w;"
// X
"SEQ.S.CC RC.w,RC.y,0;"
"IF NE.w;"
"MOV {}.x,{};"
"ELSE;"
// Y
"SEQ.S.CC RC.w,RC.y,1;"
"IF NE.w;"
"MOV {}.x,{};"
"ELSE;"
// Z
"SEQ.S.CC RC.w,RC.y,2;"
"IF NE.w;"
"MOV {}.x,{};"
"ELSE;"
// W
"MOV {}.x,{};"
"ENDIF;"
"ENDIF;"
"ENDIF;"
"ELSE;",
compare_index, ret, values[0], ret, values[1], ret, values[2], ret, values[3]);
}};
const auto read_swizzled{[&](u32 compare_index, std::string_view value) {
const std::array values{fmt::format("{}.x", value), fmt::format("{}.y", value),
fmt::format("{}.z", value), fmt::format("{}.w", value)};
read(compare_index, values);
}};
if (ctx.info.loads.AnyComponent(IR::Attribute::PositionX)) {
const u32 index{static_cast<u32>(IR::Attribute::PositionX)};
if (IsInputArray(ctx.stage)) {
read_swizzled(index, fmt::format("vertex_position{}", VertexIndex(ctx, vertex)));
} else {
read_swizzled(index, fmt::format("{}.position", ctx.attrib_name));
}
}
for (u32 index = 0; index < static_cast<u32>(IR::NUM_GENERICS); ++index) {
if (!ctx.info.loads.Generic(index)) {
continue;
}
read_swizzled(index, fmt::format("in_attr{}{}[0]", index, VertexIndex(ctx, vertex)));
}
for (u32 i = 0; i < num_endifs; ++i) {
ctx.Add("ENDIF;");
}
}
void EmitSetAttributeIndexed([[maybe_unused]] EmitContext& ctx, [[maybe_unused]] ScalarU32 offset,
[[maybe_unused]] ScalarF32 value, [[maybe_unused]] ScalarU32 vertex) {
throw NotImplementedException("GLASM instruction");
}
void EmitGetPatch(EmitContext& ctx, IR::Inst& inst, IR::Patch patch) {
if (!IR::IsGeneric(patch)) {
throw NotImplementedException("Non-generic patch load");
}
const u32 index{IR::GenericPatchIndex(patch)};
const u32 element{IR::GenericPatchElement(patch)};
const char swizzle{"xyzw"[element]};
const std::string_view out{ctx.stage == Stage::TessellationControl ? ".out" : ""};
ctx.Add("MOV.F {},primitive{}.patch.attrib[{}].{};", inst, out, index, swizzle);
}
void EmitSetPatch(EmitContext& ctx, IR::Patch patch, ScalarF32 value) {
if (IR::IsGeneric(patch)) {
const u32 index{IR::GenericPatchIndex(patch)};
const u32 element{IR::GenericPatchElement(patch)};
ctx.Add("MOV.F result.patch.attrib[{}].{},{};", index, "xyzw"[element], value);
return;
}
switch (patch) {
case IR::Patch::TessellationLodLeft:
case IR::Patch::TessellationLodRight:
case IR::Patch::TessellationLodTop:
case IR::Patch::TessellationLodBottom: {
const u32 index{static_cast<u32>(patch) - u32(IR::Patch::TessellationLodLeft)};
ctx.Add("MOV.F result.patch.tessouter[{}].x,{};", index, value);
break;
}
case IR::Patch::TessellationLodInteriorU:
ctx.Add("MOV.F result.patch.tessinner[0].x,{};", value);
break;
case IR::Patch::TessellationLodInteriorV:
ctx.Add("MOV.F result.patch.tessinner[1].x,{};", value);
break;
default:
throw NotImplementedException("Patch {}", patch);
}
}
void EmitSetFragColor(EmitContext& ctx, u32 index, u32 component, ScalarF32 value) {
ctx.Add("MOV.F frag_color{}.{},{};", index, "xyzw"[component], value);
}
void EmitSetSampleMask(EmitContext& ctx, ScalarS32 value) {
ctx.Add("MOV.S result.samplemask.x,{};", value);
}
void EmitSetFragDepth(EmitContext& ctx, ScalarF32 value) {
ctx.Add("MOV.F result.depth.z,{};", value);
}
void EmitLoadLocal(EmitContext& ctx, IR::Inst& inst, ScalarU32 word_offset) {
ctx.Add("MOV.U {},lmem[{}].x;", inst, word_offset);
}
void EmitWriteLocal(EmitContext& ctx, ScalarU32 word_offset, ScalarU32 value) {
ctx.Add("MOV.U lmem[{}].x,{};", word_offset, value);
}
} // namespace Shader::Backend::GLASM

View File

@@ -0,0 +1,231 @@
// Copyright 2021 yuzu Emulator Project
// Licensed under GPLv2 or any later version
// Refer to the license.txt file included.
#include <string_view>
#include "shader_recompiler/backend/glasm/emit_context.h"
#include "shader_recompiler/backend/glasm/emit_glasm_instructions.h"
#include "shader_recompiler/frontend/ir/modifiers.h"
#include "shader_recompiler/frontend/ir/value.h"
namespace Shader::Backend::GLASM {
namespace {
std::string_view FpRounding(IR::FpRounding fp_rounding) {
switch (fp_rounding) {
case IR::FpRounding::DontCare:
return "";
case IR::FpRounding::RN:
return ".ROUND";
case IR::FpRounding::RZ:
return ".TRUNC";
case IR::FpRounding::RM:
return ".FLR";
case IR::FpRounding::RP:
return ".CEIL";
}
throw InvalidArgument("Invalid floating-point rounding {}", fp_rounding);
}
template <typename InputType>
void Convert(EmitContext& ctx, IR::Inst& inst, InputType value, std::string_view dest,
std::string_view src, bool is_long_result) {
const std::string_view fp_rounding{FpRounding(inst.Flags<IR::FpControl>().rounding)};
const auto ret{is_long_result ? ctx.reg_alloc.LongDefine(inst) : ctx.reg_alloc.Define(inst)};
ctx.Add("CVT.{}.{}{} {}.x,{};", dest, src, fp_rounding, ret, value);
}
} // Anonymous namespace
void EmitConvertS16F16(EmitContext& ctx, IR::Inst& inst, Register value) {
Convert(ctx, inst, value, "S16", "F16", false);
}
void EmitConvertS16F32(EmitContext& ctx, IR::Inst& inst, ScalarF32 value) {
Convert(ctx, inst, value, "S16", "F32", false);
}
void EmitConvertS16F64(EmitContext& ctx, IR::Inst& inst, ScalarF64 value) {
Convert(ctx, inst, value, "S16", "F64", false);
}
void EmitConvertS32F16(EmitContext& ctx, IR::Inst& inst, Register value) {
Convert(ctx, inst, value, "S32", "F16", false);
}
void EmitConvertS32F32(EmitContext& ctx, IR::Inst& inst, ScalarF32 value) {
Convert(ctx, inst, value, "S32", "F32", false);
}
void EmitConvertS32F64(EmitContext& ctx, IR::Inst& inst, ScalarF64 value) {
Convert(ctx, inst, value, "S32", "F64", false);
}
void EmitConvertS64F16(EmitContext& ctx, IR::Inst& inst, Register value) {
Convert(ctx, inst, value, "S64", "F16", true);
}
void EmitConvertS64F32(EmitContext& ctx, IR::Inst& inst, ScalarF32 value) {
Convert(ctx, inst, value, "S64", "F32", true);
}
void EmitConvertS64F64(EmitContext& ctx, IR::Inst& inst, ScalarF64 value) {
Convert(ctx, inst, value, "S64", "F64", true);
}
void EmitConvertU16F16(EmitContext& ctx, IR::Inst& inst, Register value) {
Convert(ctx, inst, value, "U16", "F16", false);
}
void EmitConvertU16F32(EmitContext& ctx, IR::Inst& inst, ScalarF32 value) {
Convert(ctx, inst, value, "U16", "F32", false);
}
void EmitConvertU16F64(EmitContext& ctx, IR::Inst& inst, ScalarF64 value) {
Convert(ctx, inst, value, "U16", "F64", false);
}
void EmitConvertU32F16(EmitContext& ctx, IR::Inst& inst, Register value) {
Convert(ctx, inst, value, "U32", "F16", false);
}
void EmitConvertU32F32(EmitContext& ctx, IR::Inst& inst, ScalarF32 value) {
Convert(ctx, inst, value, "U32", "F32", false);
}
void EmitConvertU32F64(EmitContext& ctx, IR::Inst& inst, ScalarF64 value) {
Convert(ctx, inst, value, "U32", "F64", false);
}
void EmitConvertU64F16(EmitContext& ctx, IR::Inst& inst, Register value) {
Convert(ctx, inst, value, "U64", "F16", true);
}
void EmitConvertU64F32(EmitContext& ctx, IR::Inst& inst, ScalarF32 value) {
Convert(ctx, inst, value, "U64", "F32", true);
}
void EmitConvertU64F64(EmitContext& ctx, IR::Inst& inst, ScalarF64 value) {
Convert(ctx, inst, value, "U64", "F64", true);
}
void EmitConvertU64U32(EmitContext& ctx, IR::Inst& inst, ScalarU32 value) {
Convert(ctx, inst, value, "U64", "U32", true);
}
void EmitConvertU32U64(EmitContext& ctx, IR::Inst& inst, Register value) {
Convert(ctx, inst, value, "U32", "U64", false);
}
void EmitConvertF16F32(EmitContext& ctx, IR::Inst& inst, ScalarF32 value) {
Convert(ctx, inst, value, "F16", "F32", false);
}
void EmitConvertF32F16(EmitContext& ctx, IR::Inst& inst, Register value) {
Convert(ctx, inst, value, "F32", "F16", false);
}
void EmitConvertF32F64(EmitContext& ctx, IR::Inst& inst, ScalarF64 value) {
Convert(ctx, inst, value, "F32", "F64", false);
}
void EmitConvertF64F32(EmitContext& ctx, IR::Inst& inst, ScalarF32 value) {
Convert(ctx, inst, value, "F64", "F32", true);
}
void EmitConvertF16S8(EmitContext& ctx, IR::Inst& inst, Register value) {
Convert(ctx, inst, value, "F16", "S8", false);
}
void EmitConvertF16S16(EmitContext& ctx, IR::Inst& inst, Register value) {
Convert(ctx, inst, value, "F16", "S16", false);
}
void EmitConvertF16S32(EmitContext& ctx, IR::Inst& inst, ScalarS32 value) {
Convert(ctx, inst, value, "F16", "S32", false);
}
void EmitConvertF16S64(EmitContext& ctx, IR::Inst& inst, Register value) {
Convert(ctx, inst, value, "F16", "S64", false);
}
void EmitConvertF16U8(EmitContext& ctx, IR::Inst& inst, Register value) {
Convert(ctx, inst, value, "F16", "U8", false);
}
void EmitConvertF16U16(EmitContext& ctx, IR::Inst& inst, Register value) {
Convert(ctx, inst, value, "F16", "U16", false);
}
void EmitConvertF16U32(EmitContext& ctx, IR::Inst& inst, ScalarU32 value) {
Convert(ctx, inst, value, "F16", "U32", false);
}
void EmitConvertF16U64(EmitContext& ctx, IR::Inst& inst, Register value) {
Convert(ctx, inst, value, "F16", "U64", false);
}
void EmitConvertF32S8(EmitContext& ctx, IR::Inst& inst, Register value) {
Convert(ctx, inst, value, "F32", "S8", false);
}
void EmitConvertF32S16(EmitContext& ctx, IR::Inst& inst, Register value) {
Convert(ctx, inst, value, "F32", "S16", false);
}
void EmitConvertF32S32(EmitContext& ctx, IR::Inst& inst, ScalarS32 value) {
Convert(ctx, inst, value, "F32", "S32", false);
}
void EmitConvertF32S64(EmitContext& ctx, IR::Inst& inst, Register value) {
Convert(ctx, inst, value, "F32", "S64", false);
}
void EmitConvertF32U8(EmitContext& ctx, IR::Inst& inst, Register value) {
Convert(ctx, inst, value, "F32", "U8", false);
}
void EmitConvertF32U16(EmitContext& ctx, IR::Inst& inst, Register value) {
Convert(ctx, inst, value, "F32", "U16", false);
}
void EmitConvertF32U32(EmitContext& ctx, IR::Inst& inst, ScalarU32 value) {
Convert(ctx, inst, value, "F32", "U32", false);
}
void EmitConvertF32U64(EmitContext& ctx, IR::Inst& inst, Register value) {
Convert(ctx, inst, value, "F32", "U64", false);
}
void EmitConvertF64S8(EmitContext& ctx, IR::Inst& inst, Register value) {
Convert(ctx, inst, value, "F64", "S8", true);
}
void EmitConvertF64S16(EmitContext& ctx, IR::Inst& inst, Register value) {
Convert(ctx, inst, value, "F64", "S16", true);
}
void EmitConvertF64S32(EmitContext& ctx, IR::Inst& inst, ScalarS32 value) {
Convert(ctx, inst, value, "F64", "S32", true);
}
void EmitConvertF64S64(EmitContext& ctx, IR::Inst& inst, Register value) {
Convert(ctx, inst, value, "F64", "S64", true);
}
void EmitConvertF64U8(EmitContext& ctx, IR::Inst& inst, Register value) {
Convert(ctx, inst, value, "F64", "U8", true);
}
void EmitConvertF64U16(EmitContext& ctx, IR::Inst& inst, Register value) {
Convert(ctx, inst, value, "F64", "U16", true);
}
void EmitConvertF64U32(EmitContext& ctx, IR::Inst& inst, ScalarU32 value) {
Convert(ctx, inst, value, "F64", "U32", true);
}
void EmitConvertF64U64(EmitContext& ctx, IR::Inst& inst, Register value) {
Convert(ctx, inst, value, "F64", "U64", true);
}
} // namespace Shader::Backend::GLASM

View File

@@ -0,0 +1,414 @@
// Copyright 2021 yuzu Emulator Project
// Licensed under GPLv2 or any later version
// Refer to the license.txt file included.
#include <string_view>
#include "shader_recompiler/backend/glasm/emit_context.h"
#include "shader_recompiler/backend/glasm/emit_glasm_instructions.h"
#include "shader_recompiler/frontend/ir/modifiers.h"
#include "shader_recompiler/frontend/ir/value.h"
namespace Shader::Backend::GLASM {
namespace {
template <typename InputType>
void Compare(EmitContext& ctx, IR::Inst& inst, InputType lhs, InputType rhs, std::string_view op,
std::string_view type, bool ordered, bool inequality = false) {
const Register ret{ctx.reg_alloc.Define(inst)};
ctx.Add("{}.{} RC.x,{},{};", op, type, lhs, rhs);
if (ordered && inequality) {
ctx.Add("SEQ.{} RC.y,{},{};"
"SEQ.{} RC.z,{},{};"
"AND.U RC.x,RC.x,RC.y;"
"AND.U RC.x,RC.x,RC.z;"
"SNE.S {}.x,RC.x,0;",
type, lhs, lhs, type, rhs, rhs, ret);
} else if (ordered) {
ctx.Add("SNE.S {}.x,RC.x,0;", ret);
} else {
ctx.Add("SNE.{} RC.y,{},{};"
"SNE.{} RC.z,{},{};"
"OR.U RC.x,RC.x,RC.y;"
"OR.U RC.x,RC.x,RC.z;"
"SNE.S {}.x,RC.x,0;",
type, lhs, lhs, type, rhs, rhs, ret);
}
}
template <typename InputType>
void Clamp(EmitContext& ctx, Register ret, InputType value, InputType min_value,
InputType max_value, std::string_view type) {
// Call MAX first to properly clamp nan to min_value instead
ctx.Add("MAX.{} RC.x,{},{};"
"MIN.{} {}.x,RC.x,{};",
type, min_value, value, type, ret, max_value);
}
std::string_view Precise(IR::Inst& inst) {
const bool precise{inst.Flags<IR::FpControl>().no_contraction};
return precise ? ".PREC" : "";
}
} // Anonymous namespace
void EmitFPAbs16([[maybe_unused]] EmitContext& ctx, [[maybe_unused]] IR::Inst& inst,
[[maybe_unused]] Register value) {
throw NotImplementedException("GLASM instruction");
}
void EmitFPAbs32(EmitContext& ctx, IR::Inst& inst, ScalarF32 value) {
ctx.Add("MOV.F {}.x,|{}|;", inst, value);
}
void EmitFPAbs64(EmitContext& ctx, IR::Inst& inst, ScalarF64 value) {
ctx.LongAdd("MOV.F64 {}.x,|{}|;", inst, value);
}
void EmitFPAdd16([[maybe_unused]] EmitContext& ctx, [[maybe_unused]] IR::Inst& inst,
[[maybe_unused]] Register a, [[maybe_unused]] Register b) {
throw NotImplementedException("GLASM instruction");
}
void EmitFPAdd32(EmitContext& ctx, IR::Inst& inst, ScalarF32 a, ScalarF32 b) {
ctx.Add("ADD.F{} {}.x,{},{};", Precise(inst), ctx.reg_alloc.Define(inst), a, b);
}
void EmitFPAdd64(EmitContext& ctx, IR::Inst& inst, ScalarF64 a, ScalarF64 b) {
ctx.Add("ADD.F64{} {}.x,{},{};", Precise(inst), ctx.reg_alloc.LongDefine(inst), a, b);
}
void EmitFPFma16([[maybe_unused]] EmitContext& ctx, [[maybe_unused]] IR::Inst& inst,
[[maybe_unused]] Register a, [[maybe_unused]] Register b,
[[maybe_unused]] Register c) {
throw NotImplementedException("GLASM instruction");
}
void EmitFPFma32(EmitContext& ctx, IR::Inst& inst, ScalarF32 a, ScalarF32 b, ScalarF32 c) {
ctx.Add("MAD.F{} {}.x,{},{},{};", Precise(inst), ctx.reg_alloc.Define(inst), a, b, c);
}
void EmitFPFma64(EmitContext& ctx, IR::Inst& inst, ScalarF64 a, ScalarF64 b, ScalarF64 c) {
ctx.Add("MAD.F64{} {}.x,{},{},{};", Precise(inst), ctx.reg_alloc.LongDefine(inst), a, b, c);
}
void EmitFPMax32(EmitContext& ctx, IR::Inst& inst, ScalarF32 a, ScalarF32 b) {
ctx.Add("MAX.F {}.x,{},{};", inst, a, b);
}
void EmitFPMax64(EmitContext& ctx, IR::Inst& inst, ScalarF64 a, ScalarF64 b) {
ctx.LongAdd("MAX.F64 {}.x,{},{};", inst, a, b);
}
void EmitFPMin32(EmitContext& ctx, IR::Inst& inst, ScalarF32 a, ScalarF32 b) {
ctx.Add("MIN.F {}.x,{},{};", inst, a, b);
}
void EmitFPMin64(EmitContext& ctx, IR::Inst& inst, ScalarF64 a, ScalarF64 b) {
ctx.LongAdd("MIN.F64 {}.x,{},{};", inst, a, b);
}
void EmitFPMul16([[maybe_unused]] EmitContext& ctx, [[maybe_unused]] IR::Inst& inst,
[[maybe_unused]] Register a, [[maybe_unused]] Register b) {
throw NotImplementedException("GLASM instruction");
}
void EmitFPMul32(EmitContext& ctx, IR::Inst& inst, ScalarF32 a, ScalarF32 b) {
ctx.Add("MUL.F{} {}.x,{},{};", Precise(inst), ctx.reg_alloc.Define(inst), a, b);
}
void EmitFPMul64(EmitContext& ctx, IR::Inst& inst, ScalarF64 a, ScalarF64 b) {
ctx.Add("MUL.F64{} {}.x,{},{};", Precise(inst), ctx.reg_alloc.LongDefine(inst), a, b);
}
void EmitFPNeg16([[maybe_unused]] EmitContext& ctx, [[maybe_unused]] Register value) {
throw NotImplementedException("GLASM instruction");
}
void EmitFPNeg32(EmitContext& ctx, IR::Inst& inst, ScalarRegister value) {
ctx.Add("MOV.F {}.x,-{};", inst, value);
}
void EmitFPNeg64(EmitContext& ctx, IR::Inst& inst, Register value) {
ctx.LongAdd("MOV.F64 {}.x,-{};", inst, value);
}
void EmitFPSin(EmitContext& ctx, IR::Inst& inst, ScalarF32 value) {
ctx.Add("SIN {}.x,{};", inst, value);
}
void EmitFPCos(EmitContext& ctx, IR::Inst& inst, ScalarF32 value) {
ctx.Add("COS {}.x,{};", inst, value);
}
void EmitFPExp2(EmitContext& ctx, IR::Inst& inst, ScalarF32 value) {
ctx.Add("EX2 {}.x,{};", inst, value);
}
void EmitFPLog2(EmitContext& ctx, IR::Inst& inst, ScalarF32 value) {
ctx.Add("LG2 {}.x,{};", inst, value);
}
void EmitFPRecip32(EmitContext& ctx, IR::Inst& inst, ScalarF32 value) {
ctx.Add("RCP {}.x,{};", inst, value);
}
void EmitFPRecip64([[maybe_unused]] EmitContext& ctx, [[maybe_unused]] Register value) {
throw NotImplementedException("GLASM instruction");
}
void EmitFPRecipSqrt32(EmitContext& ctx, IR::Inst& inst, ScalarF32 value) {
ctx.Add("RSQ {}.x,{};", inst, value);
}
void EmitFPRecipSqrt64([[maybe_unused]] EmitContext& ctx, [[maybe_unused]] Register value) {
throw NotImplementedException("GLASM instruction");
}
void EmitFPSqrt(EmitContext& ctx, IR::Inst& inst, ScalarF32 value) {
const Register ret{ctx.reg_alloc.Define(inst)};
ctx.Add("RSQ RC.x,{};RCP {}.x,RC.x;", value, ret);
}
void EmitFPSaturate16([[maybe_unused]] EmitContext& ctx, [[maybe_unused]] Register value) {
throw NotImplementedException("GLASM instruction");
}
void EmitFPSaturate32(EmitContext& ctx, IR::Inst& inst, ScalarF32 value) {
ctx.Add("MOV.F.SAT {}.x,{};", inst, value);
}
void EmitFPSaturate64([[maybe_unused]] EmitContext& ctx, [[maybe_unused]] Register value) {
throw NotImplementedException("GLASM instruction");
}
void EmitFPClamp16([[maybe_unused]] EmitContext& ctx, [[maybe_unused]] Register value,
[[maybe_unused]] Register min_value, [[maybe_unused]] Register max_value) {
throw NotImplementedException("GLASM instruction");
}
void EmitFPClamp32(EmitContext& ctx, IR::Inst& inst, ScalarF32 value, ScalarF32 min_value,
ScalarF32 max_value) {
Clamp(ctx, ctx.reg_alloc.Define(inst), value, min_value, max_value, "F");
}
void EmitFPClamp64(EmitContext& ctx, IR::Inst& inst, ScalarF64 value, ScalarF64 min_value,
ScalarF64 max_value) {
Clamp(ctx, ctx.reg_alloc.LongDefine(inst), value, min_value, max_value, "F64");
}
void EmitFPRoundEven16([[maybe_unused]] EmitContext& ctx, [[maybe_unused]] Register value) {
throw NotImplementedException("GLASM instruction");
}
void EmitFPRoundEven32(EmitContext& ctx, IR::Inst& inst, ScalarF32 value) {
ctx.Add("ROUND.F {}.x,{};", inst, value);
}
void EmitFPRoundEven64(EmitContext& ctx, IR::Inst& inst, ScalarF64 value) {
ctx.LongAdd("ROUND.F64 {}.x,{};", inst, value);
}
void EmitFPFloor16([[maybe_unused]] EmitContext& ctx, [[maybe_unused]] Register value) {
throw NotImplementedException("GLASM instruction");
}
void EmitFPFloor32(EmitContext& ctx, IR::Inst& inst, ScalarF32 value) {
ctx.Add("FLR.F {}.x,{};", inst, value);
}
void EmitFPFloor64(EmitContext& ctx, IR::Inst& inst, ScalarF64 value) {
ctx.LongAdd("FLR.F64 {}.x,{};", inst, value);
}
void EmitFPCeil16([[maybe_unused]] EmitContext& ctx, [[maybe_unused]] Register value) {
throw NotImplementedException("GLASM instruction");
}
void EmitFPCeil32(EmitContext& ctx, IR::Inst& inst, ScalarF32 value) {
ctx.Add("CEIL.F {}.x,{};", inst, value);
}
void EmitFPCeil64(EmitContext& ctx, IR::Inst& inst, ScalarF64 value) {
ctx.LongAdd("CEIL.F64 {}.x,{};", inst, value);
}
void EmitFPTrunc16([[maybe_unused]] EmitContext& ctx, [[maybe_unused]] Register value) {
throw NotImplementedException("GLASM instruction");
}
void EmitFPTrunc32(EmitContext& ctx, IR::Inst& inst, ScalarF32 value) {
ctx.Add("TRUNC.F {}.x,{};", inst, value);
}
void EmitFPTrunc64(EmitContext& ctx, IR::Inst& inst, ScalarF64 value) {
ctx.LongAdd("TRUNC.F64 {}.x,{};", inst, value);
}
void EmitFPOrdEqual16([[maybe_unused]] EmitContext& ctx, [[maybe_unused]] Register lhs,
[[maybe_unused]] Register rhs) {
throw NotImplementedException("GLASM instruction");
}
void EmitFPOrdEqual32(EmitContext& ctx, IR::Inst& inst, ScalarF32 lhs, ScalarF32 rhs) {
Compare(ctx, inst, lhs, rhs, "SEQ", "F", true);
}
void EmitFPOrdEqual64(EmitContext& ctx, IR::Inst& inst, ScalarF64 lhs, ScalarF64 rhs) {
Compare(ctx, inst, lhs, rhs, "SEQ", "F64", true);
}
void EmitFPUnordEqual16([[maybe_unused]] EmitContext& ctx, [[maybe_unused]] Register lhs,
[[maybe_unused]] Register rhs) {
throw NotImplementedException("GLASM instruction");
}
void EmitFPUnordEqual32(EmitContext& ctx, IR::Inst& inst, ScalarF32 lhs, ScalarF32 rhs) {
Compare(ctx, inst, lhs, rhs, "SEQ", "F", false);
}
void EmitFPUnordEqual64(EmitContext& ctx, IR::Inst& inst, ScalarF64 lhs, ScalarF64 rhs) {
Compare(ctx, inst, lhs, rhs, "SEQ", "F64", false);
}
void EmitFPOrdNotEqual16([[maybe_unused]] EmitContext& ctx, [[maybe_unused]] Register lhs,
[[maybe_unused]] Register rhs) {
throw NotImplementedException("GLASM instruction");
}
void EmitFPOrdNotEqual32(EmitContext& ctx, IR::Inst& inst, ScalarF32 lhs, ScalarF32 rhs) {
Compare(ctx, inst, lhs, rhs, "SNE", "F", true, true);
}
void EmitFPOrdNotEqual64(EmitContext& ctx, IR::Inst& inst, ScalarF64 lhs, ScalarF64 rhs) {
Compare(ctx, inst, lhs, rhs, "SNE", "F64", true, true);
}
void EmitFPUnordNotEqual16([[maybe_unused]] EmitContext& ctx, [[maybe_unused]] Register lhs,
[[maybe_unused]] Register rhs) {
throw NotImplementedException("GLASM instruction");
}
void EmitFPUnordNotEqual32(EmitContext& ctx, IR::Inst& inst, ScalarF32 lhs, ScalarF32 rhs) {
Compare(ctx, inst, lhs, rhs, "SNE", "F", false, true);
}
void EmitFPUnordNotEqual64(EmitContext& ctx, IR::Inst& inst, ScalarF64 lhs, ScalarF64 rhs) {
Compare(ctx, inst, lhs, rhs, "SNE", "F64", false, true);
}
void EmitFPOrdLessThan16([[maybe_unused]] EmitContext& ctx, [[maybe_unused]] Register lhs,
[[maybe_unused]] Register rhs) {
throw NotImplementedException("GLASM instruction");
}
void EmitFPOrdLessThan32(EmitContext& ctx, IR::Inst& inst, ScalarF32 lhs, ScalarF32 rhs) {
Compare(ctx, inst, lhs, rhs, "SLT", "F", true);
}
void EmitFPOrdLessThan64(EmitContext& ctx, IR::Inst& inst, ScalarF64 lhs, ScalarF64 rhs) {
Compare(ctx, inst, lhs, rhs, "SLT", "F64", true);
}
void EmitFPUnordLessThan16([[maybe_unused]] EmitContext& ctx, [[maybe_unused]] Register lhs,
[[maybe_unused]] Register rhs) {
throw NotImplementedException("GLASM instruction");
}
void EmitFPUnordLessThan32(EmitContext& ctx, IR::Inst& inst, ScalarF32 lhs, ScalarF32 rhs) {
Compare(ctx, inst, lhs, rhs, "SLT", "F", false);
}
void EmitFPUnordLessThan64(EmitContext& ctx, IR::Inst& inst, ScalarF64 lhs, ScalarF64 rhs) {
Compare(ctx, inst, lhs, rhs, "SLT", "F64", false);
}
void EmitFPOrdGreaterThan16([[maybe_unused]] EmitContext& ctx, [[maybe_unused]] Register lhs,
[[maybe_unused]] Register rhs) {
throw NotImplementedException("GLASM instruction");
}
void EmitFPOrdGreaterThan32(EmitContext& ctx, IR::Inst& inst, ScalarF32 lhs, ScalarF32 rhs) {
Compare(ctx, inst, lhs, rhs, "SGT", "F", true);
}
void EmitFPOrdGreaterThan64(EmitContext& ctx, IR::Inst& inst, ScalarF64 lhs, ScalarF64 rhs) {
Compare(ctx, inst, lhs, rhs, "SGT", "F64", true);
}
void EmitFPUnordGreaterThan16([[maybe_unused]] EmitContext& ctx, [[maybe_unused]] Register lhs,
[[maybe_unused]] Register rhs) {
throw NotImplementedException("GLASM instruction");
}
void EmitFPUnordGreaterThan32(EmitContext& ctx, IR::Inst& inst, ScalarF32 lhs, ScalarF32 rhs) {
Compare(ctx, inst, lhs, rhs, "SGT", "F", false);
}
void EmitFPUnordGreaterThan64(EmitContext& ctx, IR::Inst& inst, ScalarF64 lhs, ScalarF64 rhs) {
Compare(ctx, inst, lhs, rhs, "SGT", "F64", false);
}
void EmitFPOrdLessThanEqual16([[maybe_unused]] EmitContext& ctx, [[maybe_unused]] Register lhs,
[[maybe_unused]] Register rhs) {
throw NotImplementedException("GLASM instruction");
}
void EmitFPOrdLessThanEqual32(EmitContext& ctx, IR::Inst& inst, ScalarF32 lhs, ScalarF32 rhs) {
Compare(ctx, inst, lhs, rhs, "SLE", "F", true);
}
void EmitFPOrdLessThanEqual64(EmitContext& ctx, IR::Inst& inst, ScalarF64 lhs, ScalarF64 rhs) {
Compare(ctx, inst, lhs, rhs, "SLE", "F64", true);
}
void EmitFPUnordLessThanEqual16([[maybe_unused]] EmitContext& ctx, [[maybe_unused]] Register lhs,
[[maybe_unused]] Register rhs) {
throw NotImplementedException("GLASM instruction");
}
void EmitFPUnordLessThanEqual32(EmitContext& ctx, IR::Inst& inst, ScalarF32 lhs, ScalarF32 rhs) {
Compare(ctx, inst, lhs, rhs, "SLE", "F", false);
}
void EmitFPUnordLessThanEqual64(EmitContext& ctx, IR::Inst& inst, ScalarF64 lhs, ScalarF64 rhs) {
Compare(ctx, inst, lhs, rhs, "SLE", "F64", false);
}
void EmitFPOrdGreaterThanEqual16([[maybe_unused]] EmitContext& ctx, [[maybe_unused]] Register lhs,
[[maybe_unused]] Register rhs) {
throw NotImplementedException("GLASM instruction");
}
void EmitFPOrdGreaterThanEqual32(EmitContext& ctx, IR::Inst& inst, ScalarF32 lhs, ScalarF32 rhs) {
Compare(ctx, inst, lhs, rhs, "SGE", "F", true);
}
void EmitFPOrdGreaterThanEqual64(EmitContext& ctx, IR::Inst& inst, ScalarF64 lhs, ScalarF64 rhs) {
Compare(ctx, inst, lhs, rhs, "SGE", "F64", true);
}
void EmitFPUnordGreaterThanEqual16([[maybe_unused]] EmitContext& ctx, [[maybe_unused]] Register lhs,
[[maybe_unused]] Register rhs) {
throw NotImplementedException("GLASM instruction");
}
void EmitFPUnordGreaterThanEqual32(EmitContext& ctx, IR::Inst& inst, ScalarF32 lhs, ScalarF32 rhs) {
Compare(ctx, inst, lhs, rhs, "SGE", "F", false);
}
void EmitFPUnordGreaterThanEqual64(EmitContext& ctx, IR::Inst& inst, ScalarF64 lhs, ScalarF64 rhs) {
Compare(ctx, inst, lhs, rhs, "SGE", "F64", false);
}
void EmitFPIsNan16([[maybe_unused]] EmitContext& ctx, [[maybe_unused]] Register value) {
throw NotImplementedException("GLASM instruction");
}
void EmitFPIsNan32(EmitContext& ctx, IR::Inst& inst, ScalarF32 value) {
Compare(ctx, inst, value, value, "SNE", "F", true, false);
}
void EmitFPIsNan64(EmitContext& ctx, IR::Inst& inst, ScalarF64 value) {
Compare(ctx, inst, value, value, "SNE", "F64", true, false);
}
} // namespace Shader::Backend::GLASM

View File

@@ -0,0 +1,850 @@
// Copyright 2021 yuzu Emulator Project
// Licensed under GPLv2 or any later version
// Refer to the license.txt file included.
#include <utility>
#include "shader_recompiler/backend/glasm/emit_context.h"
#include "shader_recompiler/backend/glasm/emit_glasm_instructions.h"
#include "shader_recompiler/frontend/ir/modifiers.h"
#include "shader_recompiler/frontend/ir/value.h"
namespace Shader::Backend::GLASM {
namespace {
struct ScopedRegister {
ScopedRegister() = default;
ScopedRegister(RegAlloc& reg_alloc_) : reg_alloc{&reg_alloc_}, reg{reg_alloc->AllocReg()} {}
~ScopedRegister() {
if (reg_alloc) {
reg_alloc->FreeReg(reg);
}
}
ScopedRegister& operator=(ScopedRegister&& rhs) noexcept {
if (reg_alloc) {
reg_alloc->FreeReg(reg);
}
reg_alloc = std::exchange(rhs.reg_alloc, nullptr);
reg = rhs.reg;
return *this;
}
ScopedRegister(ScopedRegister&& rhs) noexcept
: reg_alloc{std::exchange(rhs.reg_alloc, nullptr)}, reg{rhs.reg} {}
ScopedRegister& operator=(const ScopedRegister&) = delete;
ScopedRegister(const ScopedRegister&) = delete;
RegAlloc* reg_alloc{};
Register reg;
};
std::string Texture(EmitContext& ctx, IR::TextureInstInfo info,
[[maybe_unused]] const IR::Value& index) {
// FIXME: indexed reads
if (info.type == TextureType::Buffer) {
return fmt::format("texture[{}]", ctx.texture_buffer_bindings.at(info.descriptor_index));
} else {
return fmt::format("texture[{}]", ctx.texture_bindings.at(info.descriptor_index));
}
}
std::string Image(EmitContext& ctx, IR::TextureInstInfo info,
[[maybe_unused]] const IR::Value& index) {
// FIXME: indexed reads
if (info.type == TextureType::Buffer) {
return fmt::format("image[{}]", ctx.image_buffer_bindings.at(info.descriptor_index));
} else {
return fmt::format("image[{}]", ctx.image_bindings.at(info.descriptor_index));
}
}
std::string_view TextureType(IR::TextureInstInfo info) {
if (info.is_depth) {
switch (info.type) {
case TextureType::Color1D:
return "SHADOW1D";
case TextureType::ColorArray1D:
return "SHADOWARRAY1D";
case TextureType::Color2D:
return "SHADOW2D";
case TextureType::ColorArray2D:
return "SHADOWARRAY2D";
case TextureType::Color3D:
return "SHADOW3D";
case TextureType::ColorCube:
return "SHADOWCUBE";
case TextureType::ColorArrayCube:
return "SHADOWARRAYCUBE";
case TextureType::Buffer:
return "SHADOWBUFFER";
}
} else {
switch (info.type) {
case TextureType::Color1D:
return "1D";
case TextureType::ColorArray1D:
return "ARRAY1D";
case TextureType::Color2D:
return "2D";
case TextureType::ColorArray2D:
return "ARRAY2D";
case TextureType::Color3D:
return "3D";
case TextureType::ColorCube:
return "CUBE";
case TextureType::ColorArrayCube:
return "ARRAYCUBE";
case TextureType::Buffer:
return "BUFFER";
}
}
throw InvalidArgument("Invalid texture type {}", info.type.Value());
}
std::string Offset(EmitContext& ctx, const IR::Value& offset) {
if (offset.IsEmpty()) {
return "";
}
return fmt::format(",offset({})", Register{ctx.reg_alloc.Consume(offset)});
}
std::pair<ScopedRegister, ScopedRegister> AllocOffsetsRegs(EmitContext& ctx,
const IR::Value& offset2) {
if (offset2.IsEmpty()) {
return {};
} else {
return {ctx.reg_alloc, ctx.reg_alloc};
}
}
void SwizzleOffsets(EmitContext& ctx, Register off_x, Register off_y, const IR::Value& offset1,
const IR::Value& offset2) {
const Register offsets_a{ctx.reg_alloc.Consume(offset1)};
const Register offsets_b{ctx.reg_alloc.Consume(offset2)};
// Input swizzle: [XYXY] [XYXY]
// Output swizzle: [XXXX] [YYYY]
ctx.Add("MOV {}.x,{}.x;"
"MOV {}.y,{}.z;"
"MOV {}.z,{}.x;"
"MOV {}.w,{}.z;"
"MOV {}.x,{}.y;"
"MOV {}.y,{}.w;"
"MOV {}.z,{}.y;"
"MOV {}.w,{}.w;",
off_x, offsets_a, off_x, offsets_a, off_x, offsets_b, off_x, offsets_b, off_y,
offsets_a, off_y, offsets_a, off_y, offsets_b, off_y, offsets_b);
}
std::string GradOffset(const IR::Value& offset) {
if (offset.IsImmediate()) {
LOG_WARNING(Shader_GLASM, "Gradient offset is a scalar immediate");
return "";
}
IR::Inst* const vector{offset.InstRecursive()};
if (!vector->AreAllArgsImmediates()) {
LOG_WARNING(Shader_GLASM, "Gradient offset vector is not immediate");
return "";
}
switch (vector->NumArgs()) {
case 1:
return fmt::format(",({})", static_cast<s32>(vector->Arg(0).U32()));
case 2:
return fmt::format(",({},{})", static_cast<s32>(vector->Arg(0).U32()),
static_cast<s32>(vector->Arg(1).U32()));
default:
throw LogicError("Invalid number of gradient offsets {}", vector->NumArgs());
}
}
std::pair<std::string, ScopedRegister> Coord(EmitContext& ctx, const IR::Value& coord) {
if (coord.IsImmediate()) {
ScopedRegister scoped_reg(ctx.reg_alloc);
ctx.Add("MOV.U {}.x,{};", scoped_reg.reg, ScalarU32{ctx.reg_alloc.Consume(coord)});
return {fmt::to_string(scoped_reg.reg), std::move(scoped_reg)};
}
std::string coord_vec{fmt::to_string(Register{ctx.reg_alloc.Consume(coord)})};
if (coord.InstRecursive()->HasUses()) {
// Move non-dead coords to a separate register, although this should never happen because
// vectors are only assembled for immediate texture instructions
ctx.Add("MOV.F RC,{};", coord_vec);
coord_vec = "RC";
}
return {std::move(coord_vec), ScopedRegister{}};
}
void StoreSparse(EmitContext& ctx, IR::Inst* sparse_inst) {
if (!sparse_inst) {
return;
}
const Register sparse_ret{ctx.reg_alloc.Define(*sparse_inst)};
ctx.Add("MOV.S {},-1;"
"MOV.S {}(NONRESIDENT),0;",
sparse_ret, sparse_ret);
}
std::string_view FormatStorage(ImageFormat format) {
switch (format) {
case ImageFormat::Typeless:
return "U";
case ImageFormat::R8_UINT:
return "U8";
case ImageFormat::R8_SINT:
return "S8";
case ImageFormat::R16_UINT:
return "U16";
case ImageFormat::R16_SINT:
return "S16";
case ImageFormat::R32_UINT:
return "U32";
case ImageFormat::R32G32_UINT:
return "U32X2";
case ImageFormat::R32G32B32A32_UINT:
return "U32X4";
}
throw InvalidArgument("Invalid image format {}", format);
}
template <typename T>
void ImageAtomic(EmitContext& ctx, IR::Inst& inst, const IR::Value& index, Register coord, T value,
std::string_view op) {
const auto info{inst.Flags<IR::TextureInstInfo>()};
const std::string_view type{TextureType(info)};
const std::string image{Image(ctx, info, index)};
const Register ret{ctx.reg_alloc.Define(inst)};
ctx.Add("ATOMIM.{} {},{},{},{},{};", op, ret, value, coord, image, type);
}
IR::Inst* PrepareSparse(IR::Inst& inst) {
const auto sparse_inst{inst.GetAssociatedPseudoOperation(IR::Opcode::GetSparseFromOp)};
if (sparse_inst) {
sparse_inst->Invalidate();
}
return sparse_inst;
}
} // Anonymous namespace
void EmitImageSampleImplicitLod(EmitContext& ctx, IR::Inst& inst, const IR::Value& index,
const IR::Value& coord, Register bias_lc, const IR::Value& offset) {
const auto info{inst.Flags<IR::TextureInstInfo>()};
const auto sparse_inst{PrepareSparse(inst)};
const std::string_view sparse_mod{sparse_inst ? ".SPARSE" : ""};
const std::string_view lod_clamp_mod{info.has_lod_clamp ? ".LODCLAMP" : ""};
const std::string_view type{TextureType(info)};
const std::string texture{Texture(ctx, info, index)};
const std::string offset_vec{Offset(ctx, offset)};
const auto [coord_vec, coord_alloc]{Coord(ctx, coord)};
const Register ret{ctx.reg_alloc.Define(inst)};
if (info.has_bias) {
if (info.type == TextureType::ColorArrayCube) {
ctx.Add("TXB.F{}{} {},{},{},{},ARRAYCUBE{};", lod_clamp_mod, sparse_mod, ret, coord_vec,
bias_lc, texture, offset_vec);
} else {
if (info.has_lod_clamp) {
ctx.Add("MOV.F {}.w,{}.x;"
"TXB.F.LODCLAMP{} {},{},{}.y,{},{}{};",
coord_vec, bias_lc, sparse_mod, ret, coord_vec, bias_lc, texture, type,
offset_vec);
} else {
ctx.Add("MOV.F {}.w,{}.x;"
"TXB.F{} {},{},{},{}{};",
coord_vec, bias_lc, sparse_mod, ret, coord_vec, texture, type, offset_vec);
}
}
} else {
if (info.has_lod_clamp && info.type == TextureType::ColorArrayCube) {
ctx.Add("TEX.F.LODCLAMP{} {},{},{},{},ARRAYCUBE{};", sparse_mod, ret, coord_vec,
bias_lc, texture, offset_vec);
} else {
ctx.Add("TEX.F{}{} {},{},{},{}{};", lod_clamp_mod, sparse_mod, ret, coord_vec, texture,
type, offset_vec);
}
}
StoreSparse(ctx, sparse_inst);
}
void EmitImageSampleExplicitLod(EmitContext& ctx, IR::Inst& inst, const IR::Value& index,
const IR::Value& coord, ScalarF32 lod, const IR::Value& offset) {
const auto info{inst.Flags<IR::TextureInstInfo>()};
const auto sparse_inst{PrepareSparse(inst)};
const std::string_view sparse_mod{sparse_inst ? ".SPARSE" : ""};
const std::string_view type{TextureType(info)};
const std::string texture{Texture(ctx, info, index)};
const std::string offset_vec{Offset(ctx, offset)};
const auto [coord_vec, coord_alloc]{Coord(ctx, coord)};
const Register ret{ctx.reg_alloc.Define(inst)};
if (info.type == TextureType::ColorArrayCube) {
ctx.Add("TXL.F{} {},{},{},{},ARRAYCUBE{};", sparse_mod, ret, coord_vec, lod, texture,
offset_vec);
} else {
ctx.Add("MOV.F {}.w,{};"
"TXL.F{} {},{},{},{}{};",
coord_vec, lod, sparse_mod, ret, coord_vec, texture, type, offset_vec);
}
StoreSparse(ctx, sparse_inst);
}
void EmitImageSampleDrefImplicitLod(EmitContext& ctx, IR::Inst& inst, const IR::Value& index,
const IR::Value& coord, const IR::Value& dref,
const IR::Value& bias_lc, const IR::Value& offset) {
// Allocate early to avoid aliases
const auto info{inst.Flags<IR::TextureInstInfo>()};
ScopedRegister staging;
if (info.type == TextureType::ColorArrayCube) {
staging = ScopedRegister{ctx.reg_alloc};
}
const ScalarF32 dref_val{ctx.reg_alloc.Consume(dref)};
const Register bias_lc_vec{ctx.reg_alloc.Consume(bias_lc)};
const auto sparse_inst{PrepareSparse(inst)};
const std::string_view sparse_mod{sparse_inst ? ".SPARSE" : ""};
const std::string_view type{TextureType(info)};
const std::string texture{Texture(ctx, info, index)};
const std::string offset_vec{Offset(ctx, offset)};
const auto [coord_vec, coord_alloc]{Coord(ctx, coord)};
const Register ret{ctx.reg_alloc.Define(inst)};
if (info.has_bias) {
if (info.has_lod_clamp) {
switch (info.type) {
case TextureType::Color1D:
case TextureType::ColorArray1D:
case TextureType::Color2D:
ctx.Add("MOV.F {}.z,{};"
"MOV.F {}.w,{}.x;"
"TXB.F.LODCLAMP{} {},{},{}.y,{},{}{};",
coord_vec, dref_val, coord_vec, bias_lc_vec, sparse_mod, ret, coord_vec,
bias_lc_vec, texture, type, offset_vec);
break;
case TextureType::ColorArray2D:
case TextureType::ColorCube:
ctx.Add("MOV.F {}.w,{};"
"TXB.F.LODCLAMP{} {},{},{},{},{}{};",
coord_vec, dref_val, sparse_mod, ret, coord_vec, bias_lc_vec, texture, type,
offset_vec);
break;
default:
throw NotImplementedException("Invalid type {} with bias and lod clamp",
info.type.Value());
}
} else {
switch (info.type) {
case TextureType::Color1D:
case TextureType::ColorArray1D:
case TextureType::Color2D:
ctx.Add("MOV.F {}.z,{};"
"MOV.F {}.w,{}.x;"
"TXB.F{} {},{},{},{}{};",
coord_vec, dref_val, coord_vec, bias_lc_vec, sparse_mod, ret, coord_vec,
texture, type, offset_vec);
break;
case TextureType::ColorArray2D:
case TextureType::ColorCube:
ctx.Add("MOV.F {}.w,{};"
"TXB.F{} {},{},{},{},{}{};",
coord_vec, dref_val, sparse_mod, ret, coord_vec, bias_lc_vec, texture, type,
offset_vec);
break;
case TextureType::ColorArrayCube:
ctx.Add("MOV.F {}.x,{};"
"MOV.F {}.y,{}.x;"
"TXB.F{} {},{},{},{},{}{};",
staging.reg, dref_val, staging.reg, bias_lc_vec, sparse_mod, ret, coord_vec,
staging.reg, texture, type, offset_vec);
break;
default:
throw NotImplementedException("Invalid type {}", info.type.Value());
}
}
} else {
if (info.has_lod_clamp) {
if (info.type != TextureType::ColorArrayCube) {
const bool w_swizzle{info.type == TextureType::ColorArray2D ||
info.type == TextureType::ColorCube};
const char dref_swizzle{w_swizzle ? 'w' : 'z'};
ctx.Add("MOV.F {}.{},{};"
"TEX.F.LODCLAMP{} {},{},{},{},{}{};",
coord_vec, dref_swizzle, dref_val, sparse_mod, ret, coord_vec, bias_lc_vec,
texture, type, offset_vec);
} else {
ctx.Add("MOV.F {}.x,{};"
"MOV.F {}.y,{};"
"TEX.F.LODCLAMP{} {},{},{},{},{}{};",
staging.reg, dref_val, staging.reg, bias_lc_vec, sparse_mod, ret, coord_vec,
staging.reg, texture, type, offset_vec);
}
} else {
if (info.type != TextureType::ColorArrayCube) {
const bool w_swizzle{info.type == TextureType::ColorArray2D ||
info.type == TextureType::ColorCube};
const char dref_swizzle{w_swizzle ? 'w' : 'z'};
ctx.Add("MOV.F {}.{},{};"
"TEX.F{} {},{},{},{}{};",
coord_vec, dref_swizzle, dref_val, sparse_mod, ret, coord_vec, texture,
type, offset_vec);
} else {
ctx.Add("TEX.F{} {},{},{},{},{}{};", sparse_mod, ret, coord_vec, dref_val, texture,
type, offset_vec);
}
}
}
StoreSparse(ctx, sparse_inst);
}
void EmitImageSampleDrefExplicitLod(EmitContext& ctx, IR::Inst& inst, const IR::Value& index,
const IR::Value& coord, const IR::Value& dref,
const IR::Value& lod, const IR::Value& offset) {
// Allocate early to avoid aliases
const auto info{inst.Flags<IR::TextureInstInfo>()};
ScopedRegister staging;
if (info.type == TextureType::ColorArrayCube) {
staging = ScopedRegister{ctx.reg_alloc};
}
const ScalarF32 dref_val{ctx.reg_alloc.Consume(dref)};
const ScalarF32 lod_val{ctx.reg_alloc.Consume(lod)};
const auto sparse_inst{PrepareSparse(inst)};
const std::string_view sparse_mod{sparse_inst ? ".SPARSE" : ""};
const std::string_view type{TextureType(info)};
const std::string texture{Texture(ctx, info, index)};
const std::string offset_vec{Offset(ctx, offset)};
const auto [coord_vec, coord_alloc]{Coord(ctx, coord)};
const Register ret{ctx.reg_alloc.Define(inst)};
switch (info.type) {
case TextureType::Color1D:
case TextureType::ColorArray1D:
case TextureType::Color2D:
ctx.Add("MOV.F {}.z,{};"
"MOV.F {}.w,{};"
"TXL.F{} {},{},{},{}{};",
coord_vec, dref_val, coord_vec, lod_val, sparse_mod, ret, coord_vec, texture, type,
offset_vec);
break;
case TextureType::ColorArray2D:
case TextureType::ColorCube:
ctx.Add("MOV.F {}.w,{};"
"TXL.F{} {},{},{},{},{}{};",
coord_vec, dref_val, sparse_mod, ret, coord_vec, lod_val, texture, type,
offset_vec);
break;
case TextureType::ColorArrayCube:
ctx.Add("MOV.F {}.x,{};"
"MOV.F {}.y,{};"
"TXL.F{} {},{},{},{},{}{};",
staging.reg, dref_val, staging.reg, lod_val, sparse_mod, ret, coord_vec,
staging.reg, texture, type, offset_vec);
break;
default:
throw NotImplementedException("Invalid type {}", info.type.Value());
}
StoreSparse(ctx, sparse_inst);
}
void EmitImageGather(EmitContext& ctx, IR::Inst& inst, const IR::Value& index,
const IR::Value& coord, const IR::Value& offset, const IR::Value& offset2) {
// Allocate offsets early so they don't overwrite any consumed register
const auto [off_x, off_y]{AllocOffsetsRegs(ctx, offset2)};
const auto info{inst.Flags<IR::TextureInstInfo>()};
const char comp{"xyzw"[info.gather_component]};
const auto sparse_inst{PrepareSparse(inst)};
const std::string_view sparse_mod{sparse_inst ? ".SPARSE" : ""};
const std::string_view type{TextureType(info)};
const std::string texture{Texture(ctx, info, index)};
const Register coord_vec{ctx.reg_alloc.Consume(coord)};
const Register ret{ctx.reg_alloc.Define(inst)};
if (offset2.IsEmpty()) {
const std::string offset_vec{Offset(ctx, offset)};
ctx.Add("TXG.F{} {},{},{}.{},{}{};", sparse_mod, ret, coord_vec, texture, comp, type,
offset_vec);
} else {
SwizzleOffsets(ctx, off_x.reg, off_y.reg, offset, offset2);
ctx.Add("TXGO.F{} {},{},{},{},{}.{},{};", sparse_mod, ret, coord_vec, off_x.reg, off_y.reg,
texture, comp, type);
}
StoreSparse(ctx, sparse_inst);
}
void EmitImageGatherDref(EmitContext& ctx, IR::Inst& inst, const IR::Value& index,
const IR::Value& coord, const IR::Value& offset, const IR::Value& offset2,
const IR::Value& dref) {
// FIXME: This instruction is not working as expected
// Allocate offsets early so they don't overwrite any consumed register
const auto [off_x, off_y]{AllocOffsetsRegs(ctx, offset2)};
const auto info{inst.Flags<IR::TextureInstInfo>()};
const auto sparse_inst{PrepareSparse(inst)};
const std::string_view sparse_mod{sparse_inst ? ".SPARSE" : ""};
const std::string_view type{TextureType(info)};
const std::string texture{Texture(ctx, info, index)};
const Register coord_vec{ctx.reg_alloc.Consume(coord)};
const ScalarF32 dref_value{ctx.reg_alloc.Consume(dref)};
const Register ret{ctx.reg_alloc.Define(inst)};
std::string args;
switch (info.type) {
case TextureType::Color2D:
ctx.Add("MOV.F {}.z,{};", coord_vec, dref_value);
args = fmt::to_string(coord_vec);
break;
case TextureType::ColorArray2D:
case TextureType::ColorCube:
ctx.Add("MOV.F {}.w,{};", coord_vec, dref_value);
args = fmt::to_string(coord_vec);
break;
case TextureType::ColorArrayCube:
args = fmt::format("{},{}", coord_vec, dref_value);
break;
default:
throw NotImplementedException("Invalid type {}", info.type.Value());
}
if (offset2.IsEmpty()) {
const std::string offset_vec{Offset(ctx, offset)};
ctx.Add("TXG.F{} {},{},{},{}{};", sparse_mod, ret, args, texture, type, offset_vec);
} else {
SwizzleOffsets(ctx, off_x.reg, off_y.reg, offset, offset2);
ctx.Add("TXGO.F{} {},{},{},{},{},{};", sparse_mod, ret, args, off_x.reg, off_y.reg, texture,
type);
}
StoreSparse(ctx, sparse_inst);
}
void EmitImageFetch(EmitContext& ctx, IR::Inst& inst, const IR::Value& index,
const IR::Value& coord, const IR::Value& offset, ScalarS32 lod, ScalarS32 ms) {
const auto info{inst.Flags<IR::TextureInstInfo>()};
const auto sparse_inst{PrepareSparse(inst)};
const std::string_view sparse_mod{sparse_inst ? ".SPARSE" : ""};
const std::string_view type{TextureType(info)};
const std::string texture{Texture(ctx, info, index)};
const std::string offset_vec{Offset(ctx, offset)};
const auto [coord_vec, coord_alloc]{Coord(ctx, coord)};
const Register ret{ctx.reg_alloc.Define(inst)};
if (info.type == TextureType::Buffer) {
ctx.Add("TXF.F{} {},{},{},{}{};", sparse_mod, ret, coord_vec, texture, type, offset_vec);
} else if (ms.type != Type::Void) {
ctx.Add("MOV.S {}.w,{};"
"TXFMS.F{} {},{},{},{}{};",
coord_vec, ms, sparse_mod, ret, coord_vec, texture, type, offset_vec);
} else {
ctx.Add("MOV.S {}.w,{};"
"TXF.F{} {},{},{},{}{};",
coord_vec, lod, sparse_mod, ret, coord_vec, texture, type, offset_vec);
}
StoreSparse(ctx, sparse_inst);
}
void EmitImageQueryDimensions(EmitContext& ctx, IR::Inst& inst, const IR::Value& index,
ScalarS32 lod) {
const auto info{inst.Flags<IR::TextureInstInfo>()};
const std::string texture{Texture(ctx, info, index)};
const std::string_view type{TextureType(info)};
ctx.Add("TXQ {},{},{},{};", inst, lod, texture, type);
}
void EmitImageQueryLod(EmitContext& ctx, IR::Inst& inst, const IR::Value& index, Register coord) {
const auto info{inst.Flags<IR::TextureInstInfo>()};
const std::string texture{Texture(ctx, info, index)};
const std::string_view type{TextureType(info)};
ctx.Add("LOD.F {},{},{},{};", inst, coord, texture, type);
}
void EmitImageGradient(EmitContext& ctx, IR::Inst& inst, const IR::Value& index,
const IR::Value& coord, const IR::Value& derivatives,
const IR::Value& offset, const IR::Value& lod_clamp) {
const auto info{inst.Flags<IR::TextureInstInfo>()};
ScopedRegister dpdx, dpdy;
const bool multi_component{info.num_derivates > 1 || info.has_lod_clamp};
if (multi_component) {
// Allocate this early to avoid aliasing other registers
dpdx = ScopedRegister{ctx.reg_alloc};
dpdy = ScopedRegister{ctx.reg_alloc};
}
const auto sparse_inst{PrepareSparse(inst)};
const std::string_view sparse_mod{sparse_inst ? ".SPARSE" : ""};
const std::string_view type{TextureType(info)};
const std::string texture{Texture(ctx, info, index)};
const std::string offset_vec{GradOffset(offset)};
const Register coord_vec{ctx.reg_alloc.Consume(coord)};
const Register derivatives_vec{ctx.reg_alloc.Consume(derivatives)};
const Register ret{ctx.reg_alloc.Define(inst)};
if (multi_component) {
ctx.Add("MOV.F {}.x,{}.x;"
"MOV.F {}.y,{}.z;"
"MOV.F {}.x,{}.y;"
"MOV.F {}.y,{}.w;",
dpdx.reg, derivatives_vec, dpdx.reg, derivatives_vec, dpdy.reg, derivatives_vec,
dpdy.reg, derivatives_vec);
if (info.has_lod_clamp) {
const ScalarF32 lod_clamp_value{ctx.reg_alloc.Consume(lod_clamp)};
ctx.Add("MOV.F {}.w,{};"
"TXD.F.LODCLAMP{} {},{},{},{},{},{}{};",
dpdy.reg, lod_clamp_value, sparse_mod, ret, coord_vec, dpdx.reg, dpdy.reg,
texture, type, offset_vec);
} else {
ctx.Add("TXD.F{} {},{},{},{},{},{}{};", sparse_mod, ret, coord_vec, dpdx.reg, dpdy.reg,
texture, type, offset_vec);
}
} else {
ctx.Add("TXD.F{} {},{},{}.x,{}.y,{},{}{};", sparse_mod, ret, coord_vec, derivatives_vec,
derivatives_vec, texture, type, offset_vec);
}
StoreSparse(ctx, sparse_inst);
}
void EmitImageRead(EmitContext& ctx, IR::Inst& inst, const IR::Value& index, Register coord) {
const auto info{inst.Flags<IR::TextureInstInfo>()};
const auto sparse_inst{PrepareSparse(inst)};
const std::string_view format{FormatStorage(info.image_format)};
const std::string_view sparse_mod{sparse_inst ? ".SPARSE" : ""};
const std::string_view type{TextureType(info)};
const std::string image{Image(ctx, info, index)};
const Register ret{ctx.reg_alloc.Define(inst)};
ctx.Add("LOADIM.{}{} {},{},{},{};", format, sparse_mod, ret, coord, image, type);
StoreSparse(ctx, sparse_inst);
}
void EmitImageWrite(EmitContext& ctx, IR::Inst& inst, const IR::Value& index, Register coord,
Register color) {
const auto info{inst.Flags<IR::TextureInstInfo>()};
const std::string_view format{FormatStorage(info.image_format)};
const std::string_view type{TextureType(info)};
const std::string image{Image(ctx, info, index)};
ctx.Add("STOREIM.{} {},{},{},{};", format, image, color, coord, type);
}
void EmitImageAtomicIAdd32(EmitContext& ctx, IR::Inst& inst, const IR::Value& index, Register coord,
ScalarU32 value) {
ImageAtomic(ctx, inst, index, coord, value, "ADD.U32");
}
void EmitImageAtomicSMin32(EmitContext& ctx, IR::Inst& inst, const IR::Value& index, Register coord,
ScalarS32 value) {
ImageAtomic(ctx, inst, index, coord, value, "MIN.S32");
}
void EmitImageAtomicUMin32(EmitContext& ctx, IR::Inst& inst, const IR::Value& index, Register coord,
ScalarU32 value) {
ImageAtomic(ctx, inst, index, coord, value, "MIN.U32");
}
void EmitImageAtomicSMax32(EmitContext& ctx, IR::Inst& inst, const IR::Value& index, Register coord,
ScalarS32 value) {
ImageAtomic(ctx, inst, index, coord, value, "MAX.S32");
}
void EmitImageAtomicUMax32(EmitContext& ctx, IR::Inst& inst, const IR::Value& index, Register coord,
ScalarU32 value) {
ImageAtomic(ctx, inst, index, coord, value, "MAX.U32");
}
void EmitImageAtomicInc32(EmitContext& ctx, IR::Inst& inst, const IR::Value& index, Register coord,
ScalarU32 value) {
ImageAtomic(ctx, inst, index, coord, value, "IWRAP.U32");
}
void EmitImageAtomicDec32(EmitContext& ctx, IR::Inst& inst, const IR::Value& index, Register coord,
ScalarU32 value) {
ImageAtomic(ctx, inst, index, coord, value, "DWRAP.U32");
}
void EmitImageAtomicAnd32(EmitContext& ctx, IR::Inst& inst, const IR::Value& index, Register coord,
ScalarU32 value) {
ImageAtomic(ctx, inst, index, coord, value, "AND.U32");
}
void EmitImageAtomicOr32(EmitContext& ctx, IR::Inst& inst, const IR::Value& index, Register coord,
ScalarU32 value) {
ImageAtomic(ctx, inst, index, coord, value, "OR.U32");
}
void EmitImageAtomicXor32(EmitContext& ctx, IR::Inst& inst, const IR::Value& index, Register coord,
ScalarU32 value) {
ImageAtomic(ctx, inst, index, coord, value, "XOR.U32");
}
void EmitImageAtomicExchange32(EmitContext& ctx, IR::Inst& inst, const IR::Value& index,
Register coord, ScalarU32 value) {
ImageAtomic(ctx, inst, index, coord, value, "EXCH.U32");
}
void EmitBindlessImageSampleImplicitLod(EmitContext&) {
throw LogicError("Unreachable instruction");
}
void EmitBindlessImageSampleExplicitLod(EmitContext&) {
throw LogicError("Unreachable instruction");
}
void EmitBindlessImageSampleDrefImplicitLod(EmitContext&) {
throw LogicError("Unreachable instruction");
}
void EmitBindlessImageSampleDrefExplicitLod(EmitContext&) {
throw LogicError("Unreachable instruction");
}
void EmitBindlessImageGather(EmitContext&) {
throw LogicError("Unreachable instruction");
}
void EmitBindlessImageGatherDref(EmitContext&) {
throw LogicError("Unreachable instruction");
}
void EmitBindlessImageFetch(EmitContext&) {
throw LogicError("Unreachable instruction");
}
void EmitBindlessImageQueryDimensions(EmitContext&) {
throw LogicError("Unreachable instruction");
}
void EmitBindlessImageQueryLod(EmitContext&) {
throw LogicError("Unreachable instruction");
}
void EmitBindlessImageGradient(EmitContext&) {
throw LogicError("Unreachable instruction");
}
void EmitBindlessImageRead(EmitContext&) {
throw LogicError("Unreachable instruction");
}
void EmitBindlessImageWrite(EmitContext&) {
throw LogicError("Unreachable instruction");
}
void EmitBoundImageSampleImplicitLod(EmitContext&) {
throw LogicError("Unreachable instruction");
}
void EmitBoundImageSampleExplicitLod(EmitContext&) {
throw LogicError("Unreachable instruction");
}
void EmitBoundImageSampleDrefImplicitLod(EmitContext&) {
throw LogicError("Unreachable instruction");
}
void EmitBoundImageSampleDrefExplicitLod(EmitContext&) {
throw LogicError("Unreachable instruction");
}
void EmitBoundImageGather(EmitContext&) {
throw LogicError("Unreachable instruction");
}
void EmitBoundImageGatherDref(EmitContext&) {
throw LogicError("Unreachable instruction");
}
void EmitBoundImageFetch(EmitContext&) {
throw LogicError("Unreachable instruction");
}
void EmitBoundImageQueryDimensions(EmitContext&) {
throw LogicError("Unreachable instruction");
}
void EmitBoundImageQueryLod(EmitContext&) {
throw LogicError("Unreachable instruction");
}
void EmitBoundImageGradient(EmitContext&) {
throw LogicError("Unreachable instruction");
}
void EmitBoundImageRead(EmitContext&) {
throw LogicError("Unreachable instruction");
}
void EmitBoundImageWrite(EmitContext&) {
throw LogicError("Unreachable instruction");
}
void EmitBindlessImageAtomicIAdd32(EmitContext&) {
throw LogicError("Unreachable instruction");
}
void EmitBindlessImageAtomicSMin32(EmitContext&) {
throw LogicError("Unreachable instruction");
}
void EmitBindlessImageAtomicUMin32(EmitContext&) {
throw LogicError("Unreachable instruction");
}
void EmitBindlessImageAtomicSMax32(EmitContext&) {
throw LogicError("Unreachable instruction");
}
void EmitBindlessImageAtomicUMax32(EmitContext&) {
throw LogicError("Unreachable instruction");
}
void EmitBindlessImageAtomicInc32(EmitContext&) {
throw LogicError("Unreachable instruction");
}
void EmitBindlessImageAtomicDec32(EmitContext&) {
throw LogicError("Unreachable instruction");
}
void EmitBindlessImageAtomicAnd32(EmitContext&) {
throw LogicError("Unreachable instruction");
}
void EmitBindlessImageAtomicOr32(EmitContext&) {
throw LogicError("Unreachable instruction");
}
void EmitBindlessImageAtomicXor32(EmitContext&) {
throw LogicError("Unreachable instruction");
}
void EmitBindlessImageAtomicExchange32(EmitContext&) {
throw LogicError("Unreachable instruction");
}
void EmitBoundImageAtomicIAdd32(EmitContext&) {
throw LogicError("Unreachable instruction");
}
void EmitBoundImageAtomicSMin32(EmitContext&) {
throw LogicError("Unreachable instruction");
}
void EmitBoundImageAtomicUMin32(EmitContext&) {
throw LogicError("Unreachable instruction");
}
void EmitBoundImageAtomicSMax32(EmitContext&) {
throw LogicError("Unreachable instruction");
}
void EmitBoundImageAtomicUMax32(EmitContext&) {
throw LogicError("Unreachable instruction");
}
void EmitBoundImageAtomicInc32(EmitContext&) {
throw LogicError("Unreachable instruction");
}
void EmitBoundImageAtomicDec32(EmitContext&) {
throw LogicError("Unreachable instruction");
}
void EmitBoundImageAtomicAnd32(EmitContext&) {
throw LogicError("Unreachable instruction");
}
void EmitBoundImageAtomicOr32(EmitContext&) {
throw LogicError("Unreachable instruction");
}
void EmitBoundImageAtomicXor32(EmitContext&) {
throw LogicError("Unreachable instruction");
}
void EmitBoundImageAtomicExchange32(EmitContext&) {
throw LogicError("Unreachable instruction");
}
} // namespace Shader::Backend::GLASM

View File

@@ -0,0 +1,625 @@
// Copyright 2021 yuzu Emulator Project
// Licensed under GPLv2 or any later version
// Refer to the license.txt file included.
#pragma once
#include "common/common_types.h"
#include "shader_recompiler/backend/glasm/reg_alloc.h"
namespace Shader::IR {
enum class Attribute : u64;
enum class Patch : u64;
class Inst;
class Value;
} // namespace Shader::IR
namespace Shader::Backend::GLASM {
class EmitContext;
// Microinstruction emitters
void EmitPhi(EmitContext& ctx, IR::Inst& inst);
void EmitVoid(EmitContext& ctx);
void EmitIdentity(EmitContext& ctx, IR::Inst& inst, const IR::Value& value);
void EmitConditionRef(EmitContext& ctx, IR::Inst& inst, const IR::Value& value);
void EmitReference(EmitContext&, const IR::Value& value);
void EmitPhiMove(EmitContext& ctx, const IR::Value& phi, const IR::Value& value);
void EmitJoin(EmitContext& ctx);
void EmitDemoteToHelperInvocation(EmitContext& ctx);
void EmitBarrier(EmitContext& ctx);
void EmitWorkgroupMemoryBarrier(EmitContext& ctx);
void EmitDeviceMemoryBarrier(EmitContext& ctx);
void EmitPrologue(EmitContext& ctx);
void EmitEpilogue(EmitContext& ctx);
void EmitEmitVertex(EmitContext& ctx, ScalarS32 stream);
void EmitEndPrimitive(EmitContext& ctx, const IR::Value& stream);
void EmitGetRegister(EmitContext& ctx);
void EmitSetRegister(EmitContext& ctx);
void EmitGetPred(EmitContext& ctx);
void EmitSetPred(EmitContext& ctx);
void EmitSetGotoVariable(EmitContext& ctx);
void EmitGetGotoVariable(EmitContext& ctx);
void EmitSetIndirectBranchVariable(EmitContext& ctx);
void EmitGetIndirectBranchVariable(EmitContext& ctx);
void EmitGetCbufU8(EmitContext& ctx, IR::Inst& inst, const IR::Value& binding, ScalarU32 offset);
void EmitGetCbufS8(EmitContext& ctx, IR::Inst& inst, const IR::Value& binding, ScalarU32 offset);
void EmitGetCbufU16(EmitContext& ctx, IR::Inst& inst, const IR::Value& binding, ScalarU32 offset);
void EmitGetCbufS16(EmitContext& ctx, IR::Inst& inst, const IR::Value& binding, ScalarU32 offset);
void EmitGetCbufU32(EmitContext& ctx, IR::Inst& inst, const IR::Value& binding, ScalarU32 offset);
void EmitGetCbufF32(EmitContext& ctx, IR::Inst& inst, const IR::Value& binding, ScalarU32 offset);
void EmitGetCbufU32x2(EmitContext& ctx, IR::Inst& inst, const IR::Value& binding, ScalarU32 offset);
void EmitGetAttribute(EmitContext& ctx, IR::Inst& inst, IR::Attribute attr, ScalarU32 vertex);
void EmitSetAttribute(EmitContext& ctx, IR::Attribute attr, ScalarF32 value, ScalarU32 vertex);
void EmitGetAttributeIndexed(EmitContext& ctx, IR::Inst& inst, ScalarS32 offset, ScalarU32 vertex);
void EmitSetAttributeIndexed(EmitContext& ctx, ScalarU32 offset, ScalarF32 value, ScalarU32 vertex);
void EmitGetPatch(EmitContext& ctx, IR::Inst& inst, IR::Patch patch);
void EmitSetPatch(EmitContext& ctx, IR::Patch patch, ScalarF32 value);
void EmitSetFragColor(EmitContext& ctx, u32 index, u32 component, ScalarF32 value);
void EmitSetSampleMask(EmitContext& ctx, ScalarS32 value);
void EmitSetFragDepth(EmitContext& ctx, ScalarF32 value);
void EmitGetZFlag(EmitContext& ctx);
void EmitGetSFlag(EmitContext& ctx);
void EmitGetCFlag(EmitContext& ctx);
void EmitGetOFlag(EmitContext& ctx);
void EmitSetZFlag(EmitContext& ctx);
void EmitSetSFlag(EmitContext& ctx);
void EmitSetCFlag(EmitContext& ctx);
void EmitSetOFlag(EmitContext& ctx);
void EmitWorkgroupId(EmitContext& ctx, IR::Inst& inst);
void EmitLocalInvocationId(EmitContext& ctx, IR::Inst& inst);
void EmitInvocationId(EmitContext& ctx, IR::Inst& inst);
void EmitSampleId(EmitContext& ctx, IR::Inst& inst);
void EmitIsHelperInvocation(EmitContext& ctx, IR::Inst& inst);
void EmitYDirection(EmitContext& ctx, IR::Inst& inst);
void EmitLoadLocal(EmitContext& ctx, IR::Inst& inst, ScalarU32 word_offset);
void EmitWriteLocal(EmitContext& ctx, ScalarU32 word_offset, ScalarU32 value);
void EmitUndefU1(EmitContext& ctx, IR::Inst& inst);
void EmitUndefU8(EmitContext& ctx, IR::Inst& inst);
void EmitUndefU16(EmitContext& ctx, IR::Inst& inst);
void EmitUndefU32(EmitContext& ctx, IR::Inst& inst);
void EmitUndefU64(EmitContext& ctx, IR::Inst& inst);
void EmitLoadGlobalU8(EmitContext& ctx, IR::Inst& inst, Register address);
void EmitLoadGlobalS8(EmitContext& ctx, IR::Inst& inst, Register address);
void EmitLoadGlobalU16(EmitContext& ctx, IR::Inst& inst, Register address);
void EmitLoadGlobalS16(EmitContext& ctx, IR::Inst& inst, Register address);
void EmitLoadGlobal32(EmitContext& ctx, IR::Inst& inst, Register address);
void EmitLoadGlobal64(EmitContext& ctx, IR::Inst& inst, Register address);
void EmitLoadGlobal128(EmitContext& ctx, IR::Inst& inst, Register address);
void EmitWriteGlobalU8(EmitContext& ctx, Register address, Register value);
void EmitWriteGlobalS8(EmitContext& ctx, Register address, Register value);
void EmitWriteGlobalU16(EmitContext& ctx, Register address, Register value);
void EmitWriteGlobalS16(EmitContext& ctx, Register address, Register value);
void EmitWriteGlobal32(EmitContext& ctx, Register address, ScalarU32 value);
void EmitWriteGlobal64(EmitContext& ctx, Register address, Register value);
void EmitWriteGlobal128(EmitContext& ctx, Register address, Register value);
void EmitLoadStorageU8(EmitContext& ctx, IR::Inst& inst, const IR::Value& binding,
ScalarU32 offset);
void EmitLoadStorageS8(EmitContext& ctx, IR::Inst& inst, const IR::Value& binding,
ScalarU32 offset);
void EmitLoadStorageU16(EmitContext& ctx, IR::Inst& inst, const IR::Value& binding,
ScalarU32 offset);
void EmitLoadStorageS16(EmitContext& ctx, IR::Inst& inst, const IR::Value& binding,
ScalarU32 offset);
void EmitLoadStorage32(EmitContext& ctx, IR::Inst& inst, const IR::Value& binding,
ScalarU32 offset);
void EmitLoadStorage64(EmitContext& ctx, IR::Inst& inst, const IR::Value& binding,
ScalarU32 offset);
void EmitLoadStorage128(EmitContext& ctx, IR::Inst& inst, const IR::Value& binding,
ScalarU32 offset);
void EmitWriteStorageU8(EmitContext& ctx, const IR::Value& binding, ScalarU32 offset,
ScalarU32 value);
void EmitWriteStorageS8(EmitContext& ctx, const IR::Value& binding, ScalarU32 offset,
ScalarS32 value);
void EmitWriteStorageU16(EmitContext& ctx, const IR::Value& binding, ScalarU32 offset,
ScalarU32 value);
void EmitWriteStorageS16(EmitContext& ctx, const IR::Value& binding, ScalarU32 offset,
ScalarS32 value);
void EmitWriteStorage32(EmitContext& ctx, const IR::Value& binding, ScalarU32 offset,
ScalarU32 value);
void EmitWriteStorage64(EmitContext& ctx, const IR::Value& binding, ScalarU32 offset,
Register value);
void EmitWriteStorage128(EmitContext& ctx, const IR::Value& binding, ScalarU32 offset,
Register value);
void EmitLoadSharedU8(EmitContext& ctx, IR::Inst& inst, ScalarU32 offset);
void EmitLoadSharedS8(EmitContext& ctx, IR::Inst& inst, ScalarU32 offset);
void EmitLoadSharedU16(EmitContext& ctx, IR::Inst& inst, ScalarU32 offset);
void EmitLoadSharedS16(EmitContext& ctx, IR::Inst& inst, ScalarU32 offset);
void EmitLoadSharedU32(EmitContext& ctx, IR::Inst& inst, ScalarU32 offset);
void EmitLoadSharedU64(EmitContext& ctx, IR::Inst& inst, ScalarU32 offset);
void EmitLoadSharedU128(EmitContext& ctx, IR::Inst& inst, ScalarU32 offset);
void EmitWriteSharedU8(EmitContext& ctx, ScalarU32 offset, ScalarU32 value);
void EmitWriteSharedU16(EmitContext& ctx, ScalarU32 offset, ScalarU32 value);
void EmitWriteSharedU32(EmitContext& ctx, ScalarU32 offset, ScalarU32 value);
void EmitWriteSharedU64(EmitContext& ctx, ScalarU32 offset, Register value);
void EmitWriteSharedU128(EmitContext& ctx, ScalarU32 offset, Register value);
void EmitCompositeConstructU32x2(EmitContext& ctx, IR::Inst& inst, const IR::Value& e1,
const IR::Value& e2);
void EmitCompositeConstructU32x3(EmitContext& ctx, IR::Inst& inst, const IR::Value& e1,
const IR::Value& e2, const IR::Value& e3);
void EmitCompositeConstructU32x4(EmitContext& ctx, IR::Inst& inst, const IR::Value& e1,
const IR::Value& e2, const IR::Value& e3, const IR::Value& e4);
void EmitCompositeExtractU32x2(EmitContext& ctx, IR::Inst& inst, Register composite, u32 index);
void EmitCompositeExtractU32x3(EmitContext& ctx, IR::Inst& inst, Register composite, u32 index);
void EmitCompositeExtractU32x4(EmitContext& ctx, IR::Inst& inst, Register composite, u32 index);
void EmitCompositeInsertU32x2(EmitContext& ctx, Register composite, ScalarU32 object, u32 index);
void EmitCompositeInsertU32x3(EmitContext& ctx, Register composite, ScalarU32 object, u32 index);
void EmitCompositeInsertU32x4(EmitContext& ctx, Register composite, ScalarU32 object, u32 index);
void EmitCompositeConstructF16x2(EmitContext& ctx, Register e1, Register e2);
void EmitCompositeConstructF16x3(EmitContext& ctx, Register e1, Register e2, Register e3);
void EmitCompositeConstructF16x4(EmitContext& ctx, Register e1, Register e2, Register e3,
Register e4);
void EmitCompositeExtractF16x2(EmitContext& ctx, Register composite, u32 index);
void EmitCompositeExtractF16x3(EmitContext& ctx, Register composite, u32 index);
void EmitCompositeExtractF16x4(EmitContext& ctx, Register composite, u32 index);
void EmitCompositeInsertF16x2(EmitContext& ctx, Register composite, Register object, u32 index);
void EmitCompositeInsertF16x3(EmitContext& ctx, Register composite, Register object, u32 index);
void EmitCompositeInsertF16x4(EmitContext& ctx, Register composite, Register object, u32 index);
void EmitCompositeConstructF32x2(EmitContext& ctx, IR::Inst& inst, const IR::Value& e1,
const IR::Value& e2);
void EmitCompositeConstructF32x3(EmitContext& ctx, IR::Inst& inst, const IR::Value& e1,
const IR::Value& e2, const IR::Value& e3);
void EmitCompositeConstructF32x4(EmitContext& ctx, IR::Inst& inst, const IR::Value& e1,
const IR::Value& e2, const IR::Value& e3, const IR::Value& e4);
void EmitCompositeExtractF32x2(EmitContext& ctx, IR::Inst& inst, Register composite, u32 index);
void EmitCompositeExtractF32x3(EmitContext& ctx, IR::Inst& inst, Register composite, u32 index);
void EmitCompositeExtractF32x4(EmitContext& ctx, IR::Inst& inst, Register composite, u32 index);
void EmitCompositeInsertF32x2(EmitContext& ctx, IR::Inst& inst, Register composite,
ScalarF32 object, u32 index);
void EmitCompositeInsertF32x3(EmitContext& ctx, IR::Inst& inst, Register composite,
ScalarF32 object, u32 index);
void EmitCompositeInsertF32x4(EmitContext& ctx, IR::Inst& inst, Register composite,
ScalarF32 object, u32 index);
void EmitCompositeConstructF64x2(EmitContext& ctx);
void EmitCompositeConstructF64x3(EmitContext& ctx);
void EmitCompositeConstructF64x4(EmitContext& ctx);
void EmitCompositeExtractF64x2(EmitContext& ctx);
void EmitCompositeExtractF64x3(EmitContext& ctx);
void EmitCompositeExtractF64x4(EmitContext& ctx);
void EmitCompositeInsertF64x2(EmitContext& ctx, Register composite, Register object, u32 index);
void EmitCompositeInsertF64x3(EmitContext& ctx, Register composite, Register object, u32 index);
void EmitCompositeInsertF64x4(EmitContext& ctx, Register composite, Register object, u32 index);
void EmitSelectU1(EmitContext& ctx, IR::Inst& inst, ScalarS32 cond, ScalarS32 true_value,
ScalarS32 false_value);
void EmitSelectU8(EmitContext& ctx, ScalarS32 cond, ScalarS32 true_value, ScalarS32 false_value);
void EmitSelectU16(EmitContext& ctx, ScalarS32 cond, ScalarS32 true_value, ScalarS32 false_value);
void EmitSelectU32(EmitContext& ctx, IR::Inst& inst, ScalarS32 cond, ScalarS32 true_value,
ScalarS32 false_value);
void EmitSelectU64(EmitContext& ctx, IR::Inst& inst, ScalarS32 cond, Register true_value,
Register false_value);
void EmitSelectF16(EmitContext& ctx, ScalarS32 cond, Register true_value, Register false_value);
void EmitSelectF32(EmitContext& ctx, IR::Inst& inst, ScalarS32 cond, ScalarS32 true_value,
ScalarS32 false_value);
void EmitSelectF64(EmitContext& ctx, ScalarS32 cond, Register true_value, Register false_value);
void EmitBitCastU16F16(EmitContext& ctx, IR::Inst& inst, const IR::Value& value);
void EmitBitCastU32F32(EmitContext& ctx, IR::Inst& inst, const IR::Value& value);
void EmitBitCastU64F64(EmitContext& ctx, IR::Inst& inst, const IR::Value& value);
void EmitBitCastF16U16(EmitContext& ctx, IR::Inst& inst, const IR::Value& value);
void EmitBitCastF32U32(EmitContext& ctx, IR::Inst& inst, const IR::Value& value);
void EmitBitCastF64U64(EmitContext& ctx, IR::Inst& inst, const IR::Value& value);
void EmitPackUint2x32(EmitContext& ctx, IR::Inst& inst, Register value);
void EmitUnpackUint2x32(EmitContext& ctx, IR::Inst& inst, Register value);
void EmitPackFloat2x16(EmitContext& ctx, Register value);
void EmitUnpackFloat2x16(EmitContext& ctx, Register value);
void EmitPackHalf2x16(EmitContext& ctx, IR::Inst& inst, Register value);
void EmitUnpackHalf2x16(EmitContext& ctx, IR::Inst& inst, Register value);
void EmitPackDouble2x32(EmitContext& ctx, IR::Inst& inst, Register value);
void EmitUnpackDouble2x32(EmitContext& ctx, IR::Inst& inst, Register value);
void EmitGetZeroFromOp(EmitContext& ctx);
void EmitGetSignFromOp(EmitContext& ctx);
void EmitGetCarryFromOp(EmitContext& ctx);
void EmitGetOverflowFromOp(EmitContext& ctx);
void EmitGetSparseFromOp(EmitContext& ctx);
void EmitGetInBoundsFromOp(EmitContext& ctx);
void EmitFPAbs16(EmitContext& ctx, IR::Inst& inst, Register value);
void EmitFPAbs32(EmitContext& ctx, IR::Inst& inst, ScalarF32 value);
void EmitFPAbs64(EmitContext& ctx, IR::Inst& inst, ScalarF64 value);
void EmitFPAdd16(EmitContext& ctx, IR::Inst& inst, Register a, Register b);
void EmitFPAdd32(EmitContext& ctx, IR::Inst& inst, ScalarF32 a, ScalarF32 b);
void EmitFPAdd64(EmitContext& ctx, IR::Inst& inst, ScalarF64 a, ScalarF64 b);
void EmitFPFma16(EmitContext& ctx, IR::Inst& inst, Register a, Register b, Register c);
void EmitFPFma32(EmitContext& ctx, IR::Inst& inst, ScalarF32 a, ScalarF32 b, ScalarF32 c);
void EmitFPFma64(EmitContext& ctx, IR::Inst& inst, ScalarF64 a, ScalarF64 b, ScalarF64 c);
void EmitFPMax32(EmitContext& ctx, IR::Inst& inst, ScalarF32 a, ScalarF32 b);
void EmitFPMax64(EmitContext& ctx, IR::Inst& inst, ScalarF64 a, ScalarF64 b);
void EmitFPMin32(EmitContext& ctx, IR::Inst& inst, ScalarF32 a, ScalarF32 b);
void EmitFPMin64(EmitContext& ctx, IR::Inst& inst, ScalarF64 a, ScalarF64 b);
void EmitFPMul16(EmitContext& ctx, IR::Inst& inst, Register a, Register b);
void EmitFPMul32(EmitContext& ctx, IR::Inst& inst, ScalarF32 a, ScalarF32 b);
void EmitFPMul64(EmitContext& ctx, IR::Inst& inst, ScalarF64 a, ScalarF64 b);
void EmitFPNeg16(EmitContext& ctx, Register value);
void EmitFPNeg32(EmitContext& ctx, IR::Inst& inst, ScalarRegister value);
void EmitFPNeg64(EmitContext& ctx, IR::Inst& inst, Register value);
void EmitFPSin(EmitContext& ctx, IR::Inst& inst, ScalarF32 value);
void EmitFPCos(EmitContext& ctx, IR::Inst& inst, ScalarF32 value);
void EmitFPExp2(EmitContext& ctx, IR::Inst& inst, ScalarF32 value);
void EmitFPLog2(EmitContext& ctx, IR::Inst& inst, ScalarF32 value);
void EmitFPRecip32(EmitContext& ctx, IR::Inst& inst, ScalarF32 value);
void EmitFPRecip64(EmitContext& ctx, Register value);
void EmitFPRecipSqrt32(EmitContext& ctx, IR::Inst& inst, ScalarF32 value);
void EmitFPRecipSqrt64(EmitContext& ctx, Register value);
void EmitFPSqrt(EmitContext& ctx, IR::Inst& inst, ScalarF32 value);
void EmitFPSaturate16(EmitContext& ctx, Register value);
void EmitFPSaturate32(EmitContext& ctx, IR::Inst& inst, ScalarF32 value);
void EmitFPSaturate64(EmitContext& ctx, Register value);
void EmitFPClamp16(EmitContext& ctx, Register value, Register min_value, Register max_value);
void EmitFPClamp32(EmitContext& ctx, IR::Inst& inst, ScalarF32 value, ScalarF32 min_value,
ScalarF32 max_value);
void EmitFPClamp64(EmitContext& ctx, IR::Inst& inst, ScalarF64 value, ScalarF64 min_value,
ScalarF64 max_value);
void EmitFPRoundEven16(EmitContext& ctx, Register value);
void EmitFPRoundEven32(EmitContext& ctx, IR::Inst& inst, ScalarF32 value);
void EmitFPRoundEven64(EmitContext& ctx, IR::Inst& inst, ScalarF64 value);
void EmitFPFloor16(EmitContext& ctx, Register value);
void EmitFPFloor32(EmitContext& ctx, IR::Inst& inst, ScalarF32 value);
void EmitFPFloor64(EmitContext& ctx, IR::Inst& inst, ScalarF64 value);
void EmitFPCeil16(EmitContext& ctx, Register value);
void EmitFPCeil32(EmitContext& ctx, IR::Inst& inst, ScalarF32 value);
void EmitFPCeil64(EmitContext& ctx, IR::Inst& inst, ScalarF64 value);
void EmitFPTrunc16(EmitContext& ctx, Register value);
void EmitFPTrunc32(EmitContext& ctx, IR::Inst& inst, ScalarF32 value);
void EmitFPTrunc64(EmitContext& ctx, IR::Inst& inst, ScalarF64 value);
void EmitFPOrdEqual16(EmitContext& ctx, Register lhs, Register rhs);
void EmitFPOrdEqual32(EmitContext& ctx, IR::Inst& inst, ScalarF32 lhs, ScalarF32 rhs);
void EmitFPOrdEqual64(EmitContext& ctx, IR::Inst& inst, ScalarF64 lhs, ScalarF64 rhs);
void EmitFPUnordEqual16(EmitContext& ctx, Register lhs, Register rhs);
void EmitFPUnordEqual32(EmitContext& ctx, IR::Inst& inst, ScalarF32 lhs, ScalarF32 rhs);
void EmitFPUnordEqual64(EmitContext& ctx, IR::Inst& inst, ScalarF64 lhs, ScalarF64 rhs);
void EmitFPOrdNotEqual16(EmitContext& ctx, Register lhs, Register rhs);
void EmitFPOrdNotEqual32(EmitContext& ctx, IR::Inst& inst, ScalarF32 lhs, ScalarF32 rhs);
void EmitFPOrdNotEqual64(EmitContext& ctx, IR::Inst& inst, ScalarF64 lhs, ScalarF64 rhs);
void EmitFPUnordNotEqual16(EmitContext& ctx, Register lhs, Register rhs);
void EmitFPUnordNotEqual32(EmitContext& ctx, IR::Inst& inst, ScalarF32 lhs, ScalarF32 rhs);
void EmitFPUnordNotEqual64(EmitContext& ctx, IR::Inst& inst, ScalarF64 lhs, ScalarF64 rhs);
void EmitFPOrdLessThan16(EmitContext& ctx, Register lhs, Register rhs);
void EmitFPOrdLessThan32(EmitContext& ctx, IR::Inst& inst, ScalarF32 lhs, ScalarF32 rhs);
void EmitFPOrdLessThan64(EmitContext& ctx, IR::Inst& inst, ScalarF64 lhs, ScalarF64 rhs);
void EmitFPUnordLessThan16(EmitContext& ctx, Register lhs, Register rhs);
void EmitFPUnordLessThan32(EmitContext& ctx, IR::Inst& inst, ScalarF32 lhs, ScalarF32 rhs);
void EmitFPUnordLessThan64(EmitContext& ctx, IR::Inst& inst, ScalarF64 lhs, ScalarF64 rhs);
void EmitFPOrdGreaterThan16(EmitContext& ctx, Register lhs, Register rhs);
void EmitFPOrdGreaterThan32(EmitContext& ctx, IR::Inst& inst, ScalarF32 lhs, ScalarF32 rhs);
void EmitFPOrdGreaterThan64(EmitContext& ctx, IR::Inst& inst, ScalarF64 lhs, ScalarF64 rhs);
void EmitFPUnordGreaterThan16(EmitContext& ctx, Register lhs, Register rhs);
void EmitFPUnordGreaterThan32(EmitContext& ctx, IR::Inst& inst, ScalarF32 lhs, ScalarF32 rhs);
void EmitFPUnordGreaterThan64(EmitContext& ctx, IR::Inst& inst, ScalarF64 lhs, ScalarF64 rhs);
void EmitFPOrdLessThanEqual16(EmitContext& ctx, Register lhs, Register rhs);
void EmitFPOrdLessThanEqual32(EmitContext& ctx, IR::Inst& inst, ScalarF32 lhs, ScalarF32 rhs);
void EmitFPOrdLessThanEqual64(EmitContext& ctx, IR::Inst& inst, ScalarF64 lhs, ScalarF64 rhs);
void EmitFPUnordLessThanEqual16(EmitContext& ctx, Register lhs, Register rhs);
void EmitFPUnordLessThanEqual32(EmitContext& ctx, IR::Inst& inst, ScalarF32 lhs, ScalarF32 rhs);
void EmitFPUnordLessThanEqual64(EmitContext& ctx, IR::Inst& inst, ScalarF64 lhs, ScalarF64 rhs);
void EmitFPOrdGreaterThanEqual16(EmitContext& ctx, Register lhs, Register rhs);
void EmitFPOrdGreaterThanEqual32(EmitContext& ctx, IR::Inst& inst, ScalarF32 lhs, ScalarF32 rhs);
void EmitFPOrdGreaterThanEqual64(EmitContext& ctx, IR::Inst& inst, ScalarF64 lhs, ScalarF64 rhs);
void EmitFPUnordGreaterThanEqual16(EmitContext& ctx, Register lhs, Register rhs);
void EmitFPUnordGreaterThanEqual32(EmitContext& ctx, IR::Inst& inst, ScalarF32 lhs, ScalarF32 rhs);
void EmitFPUnordGreaterThanEqual64(EmitContext& ctx, IR::Inst& inst, ScalarF64 lhs, ScalarF64 rhs);
void EmitFPIsNan16(EmitContext& ctx, Register value);
void EmitFPIsNan32(EmitContext& ctx, IR::Inst& inst, ScalarF32 value);
void EmitFPIsNan64(EmitContext& ctx, IR::Inst& inst, ScalarF64 value);
void EmitIAdd32(EmitContext& ctx, IR::Inst& inst, ScalarS32 a, ScalarS32 b);
void EmitIAdd64(EmitContext& ctx, IR::Inst& inst, Register a, Register b);
void EmitISub32(EmitContext& ctx, IR::Inst& inst, ScalarS32 a, ScalarS32 b);
void EmitISub64(EmitContext& ctx, IR::Inst& inst, Register a, Register b);
void EmitIMul32(EmitContext& ctx, IR::Inst& inst, ScalarS32 a, ScalarS32 b);
void EmitINeg32(EmitContext& ctx, IR::Inst& inst, ScalarS32 value);
void EmitINeg64(EmitContext& ctx, IR::Inst& inst, Register value);
void EmitIAbs32(EmitContext& ctx, IR::Inst& inst, ScalarS32 value);
void EmitShiftLeftLogical32(EmitContext& ctx, IR::Inst& inst, ScalarU32 base, ScalarU32 shift);
void EmitShiftLeftLogical64(EmitContext& ctx, IR::Inst& inst, ScalarRegister base, ScalarU32 shift);
void EmitShiftRightLogical32(EmitContext& ctx, IR::Inst& inst, ScalarU32 base, ScalarU32 shift);
void EmitShiftRightLogical64(EmitContext& ctx, IR::Inst& inst, ScalarRegister base,
ScalarU32 shift);
void EmitShiftRightArithmetic32(EmitContext& ctx, IR::Inst& inst, ScalarS32 base, ScalarS32 shift);
void EmitShiftRightArithmetic64(EmitContext& ctx, IR::Inst& inst, ScalarRegister base,
ScalarS32 shift);
void EmitBitwiseAnd32(EmitContext& ctx, IR::Inst& inst, ScalarS32 a, ScalarS32 b);
void EmitBitwiseOr32(EmitContext& ctx, IR::Inst& inst, ScalarS32 a, ScalarS32 b);
void EmitBitwiseXor32(EmitContext& ctx, IR::Inst& inst, ScalarS32 a, ScalarS32 b);
void EmitBitFieldInsert(EmitContext& ctx, IR::Inst& inst, ScalarS32 base, ScalarS32 insert,
ScalarS32 offset, ScalarS32 count);
void EmitBitFieldSExtract(EmitContext& ctx, IR::Inst& inst, ScalarS32 base, ScalarS32 offset,
ScalarS32 count);
void EmitBitFieldUExtract(EmitContext& ctx, IR::Inst& inst, ScalarU32 base, ScalarU32 offset,
ScalarU32 count);
void EmitBitReverse32(EmitContext& ctx, IR::Inst& inst, ScalarS32 value);
void EmitBitCount32(EmitContext& ctx, IR::Inst& inst, ScalarS32 value);
void EmitBitwiseNot32(EmitContext& ctx, IR::Inst& inst, ScalarS32 value);
void EmitFindSMsb32(EmitContext& ctx, IR::Inst& inst, ScalarS32 value);
void EmitFindUMsb32(EmitContext& ctx, IR::Inst& inst, ScalarU32 value);
void EmitSMin32(EmitContext& ctx, IR::Inst& inst, ScalarS32 a, ScalarS32 b);
void EmitUMin32(EmitContext& ctx, IR::Inst& inst, ScalarU32 a, ScalarU32 b);
void EmitSMax32(EmitContext& ctx, IR::Inst& inst, ScalarS32 a, ScalarS32 b);
void EmitUMax32(EmitContext& ctx, IR::Inst& inst, ScalarU32 a, ScalarU32 b);
void EmitSClamp32(EmitContext& ctx, IR::Inst& inst, ScalarS32 value, ScalarS32 min, ScalarS32 max);
void EmitUClamp32(EmitContext& ctx, IR::Inst& inst, ScalarU32 value, ScalarU32 min, ScalarU32 max);
void EmitSLessThan(EmitContext& ctx, IR::Inst& inst, ScalarS32 lhs, ScalarS32 rhs);
void EmitULessThan(EmitContext& ctx, IR::Inst& inst, ScalarU32 lhs, ScalarU32 rhs);
void EmitIEqual(EmitContext& ctx, IR::Inst& inst, ScalarS32 lhs, ScalarS32 rhs);
void EmitSLessThanEqual(EmitContext& ctx, IR::Inst& inst, ScalarS32 lhs, ScalarS32 rhs);
void EmitULessThanEqual(EmitContext& ctx, IR::Inst& inst, ScalarU32 lhs, ScalarU32 rhs);
void EmitSGreaterThan(EmitContext& ctx, IR::Inst& inst, ScalarS32 lhs, ScalarS32 rhs);
void EmitUGreaterThan(EmitContext& ctx, IR::Inst& inst, ScalarU32 lhs, ScalarU32 rhs);
void EmitINotEqual(EmitContext& ctx, IR::Inst& inst, ScalarS32 lhs, ScalarS32 rhs);
void EmitSGreaterThanEqual(EmitContext& ctx, IR::Inst& inst, ScalarS32 lhs, ScalarS32 rhs);
void EmitUGreaterThanEqual(EmitContext& ctx, IR::Inst& inst, ScalarU32 lhs, ScalarU32 rhs);
void EmitSharedAtomicIAdd32(EmitContext& ctx, IR::Inst& inst, ScalarU32 pointer_offset,
ScalarU32 value);
void EmitSharedAtomicSMin32(EmitContext& ctx, IR::Inst& inst, ScalarU32 pointer_offset,
ScalarS32 value);
void EmitSharedAtomicUMin32(EmitContext& ctx, IR::Inst& inst, ScalarU32 pointer_offset,
ScalarU32 value);
void EmitSharedAtomicSMax32(EmitContext& ctx, IR::Inst& inst, ScalarU32 pointer_offset,
ScalarS32 value);
void EmitSharedAtomicUMax32(EmitContext& ctx, IR::Inst& inst, ScalarU32 pointer_offset,
ScalarU32 value);
void EmitSharedAtomicInc32(EmitContext& ctx, IR::Inst& inst, ScalarU32 pointer_offset,
ScalarU32 value);
void EmitSharedAtomicDec32(EmitContext& ctx, IR::Inst& inst, ScalarU32 pointer_offset,
ScalarU32 value);
void EmitSharedAtomicAnd32(EmitContext& ctx, IR::Inst& inst, ScalarU32 pointer_offset,
ScalarU32 value);
void EmitSharedAtomicOr32(EmitContext& ctx, IR::Inst& inst, ScalarU32 pointer_offset,
ScalarU32 value);
void EmitSharedAtomicXor32(EmitContext& ctx, IR::Inst& inst, ScalarU32 pointer_offset,
ScalarU32 value);
void EmitSharedAtomicExchange32(EmitContext& ctx, IR::Inst& inst, ScalarU32 pointer_offset,
ScalarU32 value);
void EmitSharedAtomicExchange64(EmitContext& ctx, IR::Inst& inst, ScalarU32 pointer_offset,
Register value);
void EmitStorageAtomicIAdd32(EmitContext& ctx, IR::Inst& inst, const IR::Value& binding,
ScalarU32 offset, ScalarU32 value);
void EmitStorageAtomicSMin32(EmitContext& ctx, IR::Inst& inst, const IR::Value& binding,
ScalarU32 offset, ScalarS32 value);
void EmitStorageAtomicUMin32(EmitContext& ctx, IR::Inst& inst, const IR::Value& binding,
ScalarU32 offset, ScalarU32 value);
void EmitStorageAtomicSMax32(EmitContext& ctx, IR::Inst& inst, const IR::Value& binding,
ScalarU32 offset, ScalarS32 value);
void EmitStorageAtomicUMax32(EmitContext& ctx, IR::Inst& inst, const IR::Value& binding,
ScalarU32 offset, ScalarU32 value);
void EmitStorageAtomicInc32(EmitContext& ctx, IR::Inst& inst, const IR::Value& binding,
ScalarU32 offset, ScalarU32 value);
void EmitStorageAtomicDec32(EmitContext& ctx, IR::Inst& inst, const IR::Value& binding,
ScalarU32 offset, ScalarU32 value);
void EmitStorageAtomicAnd32(EmitContext& ctx, IR::Inst& inst, const IR::Value& binding,
ScalarU32 offset, ScalarU32 value);
void EmitStorageAtomicOr32(EmitContext& ctx, IR::Inst& inst, const IR::Value& binding,
ScalarU32 offset, ScalarU32 value);
void EmitStorageAtomicXor32(EmitContext& ctx, IR::Inst& inst, const IR::Value& binding,
ScalarU32 offset, ScalarU32 value);
void EmitStorageAtomicExchange32(EmitContext& ctx, IR::Inst& inst, const IR::Value& binding,
ScalarU32 offset, ScalarU32 value);
void EmitStorageAtomicIAdd64(EmitContext& ctx, IR::Inst& inst, const IR::Value& binding,
ScalarU32 offset, Register value);
void EmitStorageAtomicSMin64(EmitContext& ctx, IR::Inst& inst, const IR::Value& binding,
ScalarU32 offset, Register value);
void EmitStorageAtomicUMin64(EmitContext& ctx, IR::Inst& inst, const IR::Value& binding,
ScalarU32 offset, Register value);
void EmitStorageAtomicSMax64(EmitContext& ctx, IR::Inst& inst, const IR::Value& binding,
ScalarU32 offset, Register value);
void EmitStorageAtomicUMax64(EmitContext& ctx, IR::Inst& inst, const IR::Value& binding,
ScalarU32 offset, Register value);
void EmitStorageAtomicAnd64(EmitContext& ctx, IR::Inst& inst, const IR::Value& binding,
ScalarU32 offset, Register value);
void EmitStorageAtomicOr64(EmitContext& ctx, IR::Inst& inst, const IR::Value& binding,
ScalarU32 offset, Register value);
void EmitStorageAtomicXor64(EmitContext& ctx, IR::Inst& inst, const IR::Value& binding,
ScalarU32 offset, Register value);
void EmitStorageAtomicExchange64(EmitContext& ctx, IR::Inst& inst, const IR::Value& binding,
ScalarU32 offset, Register value);
void EmitStorageAtomicAddF32(EmitContext& ctx, IR::Inst& inst, const IR::Value& binding,
ScalarU32 offset, ScalarF32 value);
void EmitStorageAtomicAddF16x2(EmitContext& ctx, IR::Inst& inst, const IR::Value& binding,
ScalarU32 offset, Register value);
void EmitStorageAtomicAddF32x2(EmitContext& ctx, IR::Inst& inst, const IR::Value& binding,
ScalarU32 offset, Register value);
void EmitStorageAtomicMinF16x2(EmitContext& ctx, IR::Inst& inst, const IR::Value& binding,
ScalarU32 offset, Register value);
void EmitStorageAtomicMinF32x2(EmitContext& ctx, IR::Inst& inst, const IR::Value& binding,
ScalarU32 offset, Register value);
void EmitStorageAtomicMaxF16x2(EmitContext& ctx, IR::Inst& inst, const IR::Value& binding,
ScalarU32 offset, Register value);
void EmitStorageAtomicMaxF32x2(EmitContext& ctx, IR::Inst& inst, const IR::Value& binding,
ScalarU32 offset, Register value);
void EmitGlobalAtomicIAdd32(EmitContext& ctx);
void EmitGlobalAtomicSMin32(EmitContext& ctx);
void EmitGlobalAtomicUMin32(EmitContext& ctx);
void EmitGlobalAtomicSMax32(EmitContext& ctx);
void EmitGlobalAtomicUMax32(EmitContext& ctx);
void EmitGlobalAtomicInc32(EmitContext& ctx);
void EmitGlobalAtomicDec32(EmitContext& ctx);
void EmitGlobalAtomicAnd32(EmitContext& ctx);
void EmitGlobalAtomicOr32(EmitContext& ctx);
void EmitGlobalAtomicXor32(EmitContext& ctx);
void EmitGlobalAtomicExchange32(EmitContext& ctx);
void EmitGlobalAtomicIAdd64(EmitContext& ctx);
void EmitGlobalAtomicSMin64(EmitContext& ctx);
void EmitGlobalAtomicUMin64(EmitContext& ctx);
void EmitGlobalAtomicSMax64(EmitContext& ctx);
void EmitGlobalAtomicUMax64(EmitContext& ctx);
void EmitGlobalAtomicInc64(EmitContext& ctx);
void EmitGlobalAtomicDec64(EmitContext& ctx);
void EmitGlobalAtomicAnd64(EmitContext& ctx);
void EmitGlobalAtomicOr64(EmitContext& ctx);
void EmitGlobalAtomicXor64(EmitContext& ctx);
void EmitGlobalAtomicExchange64(EmitContext& ctx);
void EmitGlobalAtomicAddF32(EmitContext& ctx);
void EmitGlobalAtomicAddF16x2(EmitContext& ctx);
void EmitGlobalAtomicAddF32x2(EmitContext& ctx);
void EmitGlobalAtomicMinF16x2(EmitContext& ctx);
void EmitGlobalAtomicMinF32x2(EmitContext& ctx);
void EmitGlobalAtomicMaxF16x2(EmitContext& ctx);
void EmitGlobalAtomicMaxF32x2(EmitContext& ctx);
void EmitLogicalOr(EmitContext& ctx, IR::Inst& inst, ScalarS32 a, ScalarS32 b);
void EmitLogicalAnd(EmitContext& ctx, IR::Inst& inst, ScalarS32 a, ScalarS32 b);
void EmitLogicalXor(EmitContext& ctx, IR::Inst& inst, ScalarS32 a, ScalarS32 b);
void EmitLogicalNot(EmitContext& ctx, IR::Inst& inst, ScalarS32 value);
void EmitConvertS16F16(EmitContext& ctx, IR::Inst& inst, Register value);
void EmitConvertS16F32(EmitContext& ctx, IR::Inst& inst, ScalarF32 value);
void EmitConvertS16F64(EmitContext& ctx, IR::Inst& inst, ScalarF64 value);
void EmitConvertS32F16(EmitContext& ctx, IR::Inst& inst, Register value);
void EmitConvertS32F32(EmitContext& ctx, IR::Inst& inst, ScalarF32 value);
void EmitConvertS32F64(EmitContext& ctx, IR::Inst& inst, ScalarF64 value);
void EmitConvertS64F16(EmitContext& ctx, IR::Inst& inst, Register value);
void EmitConvertS64F32(EmitContext& ctx, IR::Inst& inst, ScalarF32 value);
void EmitConvertS64F64(EmitContext& ctx, IR::Inst& inst, ScalarF64 value);
void EmitConvertU16F16(EmitContext& ctx, IR::Inst& inst, Register value);
void EmitConvertU16F32(EmitContext& ctx, IR::Inst& inst, ScalarF32 value);
void EmitConvertU16F64(EmitContext& ctx, IR::Inst& inst, ScalarF64 value);
void EmitConvertU32F16(EmitContext& ctx, IR::Inst& inst, Register value);
void EmitConvertU32F32(EmitContext& ctx, IR::Inst& inst, ScalarF32 value);
void EmitConvertU32F64(EmitContext& ctx, IR::Inst& inst, ScalarF64 value);
void EmitConvertU64F16(EmitContext& ctx, IR::Inst& inst, Register value);
void EmitConvertU64F32(EmitContext& ctx, IR::Inst& inst, ScalarF32 value);
void EmitConvertU64F64(EmitContext& ctx, IR::Inst& inst, ScalarF64 value);
void EmitConvertU64U32(EmitContext& ctx, IR::Inst& inst, ScalarU32 value);
void EmitConvertU32U64(EmitContext& ctx, IR::Inst& inst, Register value);
void EmitConvertF16F32(EmitContext& ctx, IR::Inst& inst, ScalarF32 value);
void EmitConvertF32F16(EmitContext& ctx, IR::Inst& inst, Register value);
void EmitConvertF32F64(EmitContext& ctx, IR::Inst& inst, ScalarF64 value);
void EmitConvertF64F32(EmitContext& ctx, IR::Inst& inst, ScalarF32 value);
void EmitConvertF16S8(EmitContext& ctx, IR::Inst& inst, Register value);
void EmitConvertF16S16(EmitContext& ctx, IR::Inst& inst, Register value);
void EmitConvertF16S32(EmitContext& ctx, IR::Inst& inst, ScalarS32 value);
void EmitConvertF16S64(EmitContext& ctx, IR::Inst& inst, Register value);
void EmitConvertF16U8(EmitContext& ctx, IR::Inst& inst, Register value);
void EmitConvertF16U16(EmitContext& ctx, IR::Inst& inst, Register value);
void EmitConvertF16U32(EmitContext& ctx, IR::Inst& inst, ScalarU32 value);
void EmitConvertF16U64(EmitContext& ctx, IR::Inst& inst, Register value);
void EmitConvertF32S8(EmitContext& ctx, IR::Inst& inst, Register value);
void EmitConvertF32S16(EmitContext& ctx, IR::Inst& inst, Register value);
void EmitConvertF32S32(EmitContext& ctx, IR::Inst& inst, ScalarS32 value);
void EmitConvertF32S64(EmitContext& ctx, IR::Inst& inst, Register value);
void EmitConvertF32U8(EmitContext& ctx, IR::Inst& inst, Register value);
void EmitConvertF32U16(EmitContext& ctx, IR::Inst& inst, Register value);
void EmitConvertF32U32(EmitContext& ctx, IR::Inst& inst, ScalarU32 value);
void EmitConvertF32U64(EmitContext& ctx, IR::Inst& inst, Register value);
void EmitConvertF64S8(EmitContext& ctx, IR::Inst& inst, Register value);
void EmitConvertF64S16(EmitContext& ctx, IR::Inst& inst, Register value);
void EmitConvertF64S32(EmitContext& ctx, IR::Inst& inst, ScalarS32 value);
void EmitConvertF64S64(EmitContext& ctx, IR::Inst& inst, Register value);
void EmitConvertF64U8(EmitContext& ctx, IR::Inst& inst, Register value);
void EmitConvertF64U16(EmitContext& ctx, IR::Inst& inst, Register value);
void EmitConvertF64U32(EmitContext& ctx, IR::Inst& inst, ScalarU32 value);
void EmitConvertF64U64(EmitContext& ctx, IR::Inst& inst, Register value);
void EmitBindlessImageSampleImplicitLod(EmitContext&);
void EmitBindlessImageSampleExplicitLod(EmitContext&);
void EmitBindlessImageSampleDrefImplicitLod(EmitContext&);
void EmitBindlessImageSampleDrefExplicitLod(EmitContext&);
void EmitBindlessImageGather(EmitContext&);
void EmitBindlessImageGatherDref(EmitContext&);
void EmitBindlessImageFetch(EmitContext&);
void EmitBindlessImageQueryDimensions(EmitContext&);
void EmitBindlessImageQueryLod(EmitContext&);
void EmitBindlessImageGradient(EmitContext&);
void EmitBindlessImageRead(EmitContext&);
void EmitBindlessImageWrite(EmitContext&);
void EmitBoundImageSampleImplicitLod(EmitContext&);
void EmitBoundImageSampleExplicitLod(EmitContext&);
void EmitBoundImageSampleDrefImplicitLod(EmitContext&);
void EmitBoundImageSampleDrefExplicitLod(EmitContext&);
void EmitBoundImageGather(EmitContext&);
void EmitBoundImageGatherDref(EmitContext&);
void EmitBoundImageFetch(EmitContext&);
void EmitBoundImageQueryDimensions(EmitContext&);
void EmitBoundImageQueryLod(EmitContext&);
void EmitBoundImageGradient(EmitContext&);
void EmitBoundImageRead(EmitContext&);
void EmitBoundImageWrite(EmitContext&);
void EmitImageSampleImplicitLod(EmitContext& ctx, IR::Inst& inst, const IR::Value& index,
const IR::Value& coord, Register bias_lc, const IR::Value& offset);
void EmitImageSampleExplicitLod(EmitContext& ctx, IR::Inst& inst, const IR::Value& index,
const IR::Value& coord, ScalarF32 lod, const IR::Value& offset);
void EmitImageSampleDrefImplicitLod(EmitContext& ctx, IR::Inst& inst, const IR::Value& index,
const IR::Value& coord, const IR::Value& dref,
const IR::Value& bias_lc, const IR::Value& offset);
void EmitImageSampleDrefExplicitLod(EmitContext& ctx, IR::Inst& inst, const IR::Value& index,
const IR::Value& coord, const IR::Value& dref,
const IR::Value& lod, const IR::Value& offset);
void EmitImageGather(EmitContext& ctx, IR::Inst& inst, const IR::Value& index,
const IR::Value& coord, const IR::Value& offset, const IR::Value& offset2);
void EmitImageGatherDref(EmitContext& ctx, IR::Inst& inst, const IR::Value& index,
const IR::Value& coord, const IR::Value& offset, const IR::Value& offset2,
const IR::Value& dref);
void EmitImageFetch(EmitContext& ctx, IR::Inst& inst, const IR::Value& index,
const IR::Value& coord, const IR::Value& offset, ScalarS32 lod, ScalarS32 ms);
void EmitImageQueryDimensions(EmitContext& ctx, IR::Inst& inst, const IR::Value& index,
ScalarS32 lod);
void EmitImageQueryLod(EmitContext& ctx, IR::Inst& inst, const IR::Value& index, Register coord);
void EmitImageGradient(EmitContext& ctx, IR::Inst& inst, const IR::Value& index,
const IR::Value& coord, const IR::Value& derivatives,
const IR::Value& offset, const IR::Value& lod_clamp);
void EmitImageRead(EmitContext& ctx, IR::Inst& inst, const IR::Value& index, Register coord);
void EmitImageWrite(EmitContext& ctx, IR::Inst& inst, const IR::Value& index, Register coord,
Register color);
void EmitBindlessImageAtomicIAdd32(EmitContext&);
void EmitBindlessImageAtomicSMin32(EmitContext&);
void EmitBindlessImageAtomicUMin32(EmitContext&);
void EmitBindlessImageAtomicSMax32(EmitContext&);
void EmitBindlessImageAtomicUMax32(EmitContext&);
void EmitBindlessImageAtomicInc32(EmitContext&);
void EmitBindlessImageAtomicDec32(EmitContext&);
void EmitBindlessImageAtomicAnd32(EmitContext&);
void EmitBindlessImageAtomicOr32(EmitContext&);
void EmitBindlessImageAtomicXor32(EmitContext&);
void EmitBindlessImageAtomicExchange32(EmitContext&);
void EmitBoundImageAtomicIAdd32(EmitContext&);
void EmitBoundImageAtomicSMin32(EmitContext&);
void EmitBoundImageAtomicUMin32(EmitContext&);
void EmitBoundImageAtomicSMax32(EmitContext&);
void EmitBoundImageAtomicUMax32(EmitContext&);
void EmitBoundImageAtomicInc32(EmitContext&);
void EmitBoundImageAtomicDec32(EmitContext&);
void EmitBoundImageAtomicAnd32(EmitContext&);
void EmitBoundImageAtomicOr32(EmitContext&);
void EmitBoundImageAtomicXor32(EmitContext&);
void EmitBoundImageAtomicExchange32(EmitContext&);
void EmitImageAtomicIAdd32(EmitContext& ctx, IR::Inst& inst, const IR::Value& index, Register coord,
ScalarU32 value);
void EmitImageAtomicSMin32(EmitContext& ctx, IR::Inst& inst, const IR::Value& index, Register coord,
ScalarS32 value);
void EmitImageAtomicUMin32(EmitContext& ctx, IR::Inst& inst, const IR::Value& index, Register coord,
ScalarU32 value);
void EmitImageAtomicSMax32(EmitContext& ctx, IR::Inst& inst, const IR::Value& index, Register coord,
ScalarS32 value);
void EmitImageAtomicUMax32(EmitContext& ctx, IR::Inst& inst, const IR::Value& index, Register coord,
ScalarU32 value);
void EmitImageAtomicInc32(EmitContext& ctx, IR::Inst& inst, const IR::Value& index, Register coord,
ScalarU32 value);
void EmitImageAtomicDec32(EmitContext& ctx, IR::Inst& inst, const IR::Value& index, Register coord,
ScalarU32 value);
void EmitImageAtomicAnd32(EmitContext& ctx, IR::Inst& inst, const IR::Value& index, Register coord,
ScalarU32 value);
void EmitImageAtomicOr32(EmitContext& ctx, IR::Inst& inst, const IR::Value& index, Register coord,
ScalarU32 value);
void EmitImageAtomicXor32(EmitContext& ctx, IR::Inst& inst, const IR::Value& index, Register coord,
ScalarU32 value);
void EmitImageAtomicExchange32(EmitContext& ctx, IR::Inst& inst, const IR::Value& index,
Register coord, ScalarU32 value);
void EmitLaneId(EmitContext& ctx, IR::Inst& inst);
void EmitVoteAll(EmitContext& ctx, IR::Inst& inst, ScalarS32 pred);
void EmitVoteAny(EmitContext& ctx, IR::Inst& inst, ScalarS32 pred);
void EmitVoteEqual(EmitContext& ctx, IR::Inst& inst, ScalarS32 pred);
void EmitSubgroupBallot(EmitContext& ctx, IR::Inst& inst, ScalarS32 pred);
void EmitSubgroupEqMask(EmitContext& ctx, IR::Inst& inst);
void EmitSubgroupLtMask(EmitContext& ctx, IR::Inst& inst);
void EmitSubgroupLeMask(EmitContext& ctx, IR::Inst& inst);
void EmitSubgroupGtMask(EmitContext& ctx, IR::Inst& inst);
void EmitSubgroupGeMask(EmitContext& ctx, IR::Inst& inst);
void EmitShuffleIndex(EmitContext& ctx, IR::Inst& inst, ScalarU32 value, ScalarU32 index,
const IR::Value& clamp, const IR::Value& segmentation_mask);
void EmitShuffleUp(EmitContext& ctx, IR::Inst& inst, ScalarU32 value, ScalarU32 index,
const IR::Value& clamp, const IR::Value& segmentation_mask);
void EmitShuffleDown(EmitContext& ctx, IR::Inst& inst, ScalarU32 value, ScalarU32 index,
const IR::Value& clamp, const IR::Value& segmentation_mask);
void EmitShuffleButterfly(EmitContext& ctx, IR::Inst& inst, ScalarU32 value, ScalarU32 index,
const IR::Value& clamp, const IR::Value& segmentation_mask);
void EmitFSwizzleAdd(EmitContext& ctx, IR::Inst& inst, ScalarF32 op_a, ScalarF32 op_b,
ScalarU32 swizzle);
void EmitDPdxFine(EmitContext& ctx, IR::Inst& inst, ScalarF32 op_a);
void EmitDPdyFine(EmitContext& ctx, IR::Inst& inst, ScalarF32 op_a);
void EmitDPdxCoarse(EmitContext& ctx, IR::Inst& inst, ScalarF32 op_a);
void EmitDPdyCoarse(EmitContext& ctx, IR::Inst& inst, ScalarF32 op_a);
} // namespace Shader::Backend::GLASM

View File

@@ -0,0 +1,294 @@
// Copyright 2021 yuzu Emulator Project
// Licensed under GPLv2 or any later version
// Refer to the license.txt file included.
#include "shader_recompiler/backend/glasm/emit_context.h"
#include "shader_recompiler/backend/glasm/emit_glasm_instructions.h"
#include "shader_recompiler/frontend/ir/value.h"
namespace Shader::Backend::GLASM {
namespace {
void BitwiseLogicalOp(EmitContext& ctx, IR::Inst& inst, ScalarS32 a, ScalarS32 b,
std::string_view lop) {
const auto zero = inst.GetAssociatedPseudoOperation(IR::Opcode::GetZeroFromOp);
const auto sign = inst.GetAssociatedPseudoOperation(IR::Opcode::GetSignFromOp);
if (zero) {
zero->Invalidate();
}
if (sign) {
sign->Invalidate();
}
if (zero || sign) {
ctx.reg_alloc.InvalidateConditionCodes();
}
const auto ret{ctx.reg_alloc.Define(inst)};
ctx.Add("{}.S {}.x,{},{};", lop, ret, a, b);
if (zero) {
ctx.Add("SEQ.S {},{},0;", *zero, ret);
}
if (sign) {
ctx.Add("SLT.S {},{},0;", *sign, ret);
}
}
} // Anonymous namespace
void EmitIAdd32(EmitContext& ctx, IR::Inst& inst, ScalarS32 a, ScalarS32 b) {
const std::array flags{
inst.GetAssociatedPseudoOperation(IR::Opcode::GetZeroFromOp),
inst.GetAssociatedPseudoOperation(IR::Opcode::GetSignFromOp),
inst.GetAssociatedPseudoOperation(IR::Opcode::GetCarryFromOp),
inst.GetAssociatedPseudoOperation(IR::Opcode::GetOverflowFromOp),
};
for (IR::Inst* const flag_inst : flags) {
if (flag_inst) {
flag_inst->Invalidate();
}
}
const bool cc{inst.HasAssociatedPseudoOperation()};
const std::string_view cc_mod{cc ? ".CC" : ""};
if (cc) {
ctx.reg_alloc.InvalidateConditionCodes();
}
const auto ret{ctx.reg_alloc.Define(inst)};
ctx.Add("ADD.S{} {}.x,{},{};", cc_mod, ret, a, b);
if (!cc) {
return;
}
static constexpr std::array<std::string_view, 4> masks{"", "SF", "CF", "OF"};
for (size_t flag_index = 0; flag_index < flags.size(); ++flag_index) {
if (!flags[flag_index]) {
continue;
}
const auto flag_ret{ctx.reg_alloc.Define(*flags[flag_index])};
if (flag_index == 0) {
ctx.Add("SEQ.S {}.x,{}.x,0;", flag_ret, ret);
} else {
// We could use conditional execution here, but it's broken on Nvidia's compiler
ctx.Add("IF {}.x;"
"MOV.S {}.x,-1;"
"ELSE;"
"MOV.S {}.x,0;"
"ENDIF;",
masks[flag_index], flag_ret, flag_ret);
}
}
}
void EmitIAdd64(EmitContext& ctx, IR::Inst& inst, Register a, Register b) {
ctx.LongAdd("ADD.S64 {}.x,{}.x,{}.x;", inst, a, b);
}
void EmitISub32(EmitContext& ctx, IR::Inst& inst, ScalarS32 a, ScalarS32 b) {
ctx.Add("SUB.S {}.x,{},{};", inst, a, b);
}
void EmitISub64(EmitContext& ctx, IR::Inst& inst, Register a, Register b) {
ctx.LongAdd("SUB.S64 {}.x,{}.x,{}.x;", inst, a, b);
}
void EmitIMul32(EmitContext& ctx, IR::Inst& inst, ScalarS32 a, ScalarS32 b) {
ctx.Add("MUL.S {}.x,{},{};", inst, a, b);
}
void EmitINeg32(EmitContext& ctx, IR::Inst& inst, ScalarS32 value) {
if (value.type != Type::Register && static_cast<s32>(value.imm_u32) < 0) {
ctx.Add("MOV.S {},{};", inst, -static_cast<s32>(value.imm_u32));
} else {
ctx.Add("MOV.S {},-{};", inst, value);
}
}
void EmitINeg64(EmitContext& ctx, IR::Inst& inst, Register value) {
ctx.LongAdd("MOV.S64 {},-{};", inst, value);
}
void EmitIAbs32(EmitContext& ctx, IR::Inst& inst, ScalarS32 value) {
ctx.Add("ABS.S {},{};", inst, value);
}
void EmitShiftLeftLogical32(EmitContext& ctx, IR::Inst& inst, ScalarU32 base, ScalarU32 shift) {
ctx.Add("SHL.U {}.x,{},{};", inst, base, shift);
}
void EmitShiftLeftLogical64(EmitContext& ctx, IR::Inst& inst, ScalarRegister base,
ScalarU32 shift) {
ctx.LongAdd("SHL.U64 {}.x,{},{};", inst, base, shift);
}
void EmitShiftRightLogical32(EmitContext& ctx, IR::Inst& inst, ScalarU32 base, ScalarU32 shift) {
ctx.Add("SHR.U {}.x,{},{};", inst, base, shift);
}
void EmitShiftRightLogical64(EmitContext& ctx, IR::Inst& inst, ScalarRegister base,
ScalarU32 shift) {
ctx.LongAdd("SHR.U64 {}.x,{},{};", inst, base, shift);
}
void EmitShiftRightArithmetic32(EmitContext& ctx, IR::Inst& inst, ScalarS32 base, ScalarS32 shift) {
ctx.Add("SHR.S {}.x,{},{};", inst, base, shift);
}
void EmitShiftRightArithmetic64(EmitContext& ctx, IR::Inst& inst, ScalarRegister base,
ScalarS32 shift) {
ctx.LongAdd("SHR.S64 {}.x,{},{};", inst, base, shift);
}
void EmitBitwiseAnd32(EmitContext& ctx, IR::Inst& inst, ScalarS32 a, ScalarS32 b) {
BitwiseLogicalOp(ctx, inst, a, b, "AND");
}
void EmitBitwiseOr32(EmitContext& ctx, IR::Inst& inst, ScalarS32 a, ScalarS32 b) {
BitwiseLogicalOp(ctx, inst, a, b, "OR");
}
void EmitBitwiseXor32(EmitContext& ctx, IR::Inst& inst, ScalarS32 a, ScalarS32 b) {
BitwiseLogicalOp(ctx, inst, a, b, "XOR");
}
void EmitBitFieldInsert(EmitContext& ctx, IR::Inst& inst, ScalarS32 base, ScalarS32 insert,
ScalarS32 offset, ScalarS32 count) {
const Register ret{ctx.reg_alloc.Define(inst)};
if (count.type != Type::Register && offset.type != Type::Register) {
ctx.Add("BFI.S {},{{{},{},0,0}},{},{};", ret, count, offset, insert, base);
} else {
ctx.Add("MOV.S RC.x,{};"
"MOV.S RC.y,{};"
"BFI.S {},RC,{},{};",
count, offset, ret, insert, base);
}
}
void EmitBitFieldSExtract(EmitContext& ctx, IR::Inst& inst, ScalarS32 base, ScalarS32 offset,
ScalarS32 count) {
const Register ret{ctx.reg_alloc.Define(inst)};
if (count.type != Type::Register && offset.type != Type::Register) {
ctx.Add("BFE.S {},{{{},{},0,0}},{};", ret, count, offset, base);
} else {
ctx.Add("MOV.S RC.x,{};"
"MOV.S RC.y,{};"
"BFE.S {},RC,{};",
count, offset, ret, base);
}
}
void EmitBitFieldUExtract(EmitContext& ctx, IR::Inst& inst, ScalarU32 base, ScalarU32 offset,
ScalarU32 count) {
const auto zero = inst.GetAssociatedPseudoOperation(IR::Opcode::GetZeroFromOp);
const auto sign = inst.GetAssociatedPseudoOperation(IR::Opcode::GetSignFromOp);
if (zero) {
zero->Invalidate();
}
if (sign) {
sign->Invalidate();
}
if (zero || sign) {
ctx.reg_alloc.InvalidateConditionCodes();
}
const Register ret{ctx.reg_alloc.Define(inst)};
if (count.type != Type::Register && offset.type != Type::Register) {
ctx.Add("BFE.U {},{{{},{},0,0}},{};", ret, count, offset, base);
} else {
ctx.Add("MOV.U RC.x,{};"
"MOV.U RC.y,{};"
"BFE.U {},RC,{};",
count, offset, ret, base);
}
if (zero) {
ctx.Add("SEQ.S {},{},0;", *zero, ret);
}
if (sign) {
ctx.Add("SLT.S {},{},0;", *sign, ret);
}
}
void EmitBitReverse32(EmitContext& ctx, IR::Inst& inst, ScalarS32 value) {
ctx.Add("BFR {},{};", inst, value);
}
void EmitBitCount32(EmitContext& ctx, IR::Inst& inst, ScalarS32 value) {
ctx.Add("BTC {},{};", inst, value);
}
void EmitBitwiseNot32(EmitContext& ctx, IR::Inst& inst, ScalarS32 value) {
ctx.Add("NOT.S {},{};", inst, value);
}
void EmitFindSMsb32(EmitContext& ctx, IR::Inst& inst, ScalarS32 value) {
ctx.Add("BTFM.S {},{};", inst, value);
}
void EmitFindUMsb32(EmitContext& ctx, IR::Inst& inst, ScalarU32 value) {
ctx.Add("BTFM.U {},{};", inst, value);
}
void EmitSMin32(EmitContext& ctx, IR::Inst& inst, ScalarS32 a, ScalarS32 b) {
ctx.Add("MIN.S {},{},{};", inst, a, b);
}
void EmitUMin32(EmitContext& ctx, IR::Inst& inst, ScalarU32 a, ScalarU32 b) {
ctx.Add("MIN.U {},{},{};", inst, a, b);
}
void EmitSMax32(EmitContext& ctx, IR::Inst& inst, ScalarS32 a, ScalarS32 b) {
ctx.Add("MAX.S {},{},{};", inst, a, b);
}
void EmitUMax32(EmitContext& ctx, IR::Inst& inst, ScalarU32 a, ScalarU32 b) {
ctx.Add("MAX.U {},{},{};", inst, a, b);
}
void EmitSClamp32(EmitContext& ctx, IR::Inst& inst, ScalarS32 value, ScalarS32 min, ScalarS32 max) {
const Register ret{ctx.reg_alloc.Define(inst)};
ctx.Add("MIN.S RC.x,{},{};"
"MAX.S {}.x,RC.x,{};",
max, value, ret, min);
}
void EmitUClamp32(EmitContext& ctx, IR::Inst& inst, ScalarU32 value, ScalarU32 min, ScalarU32 max) {
const Register ret{ctx.reg_alloc.Define(inst)};
ctx.Add("MIN.U RC.x,{},{};"
"MAX.U {}.x,RC.x,{};",
max, value, ret, min);
}
void EmitSLessThan(EmitContext& ctx, IR::Inst& inst, ScalarS32 lhs, ScalarS32 rhs) {
ctx.Add("SLT.S {}.x,{},{};", inst, lhs, rhs);
}
void EmitULessThan(EmitContext& ctx, IR::Inst& inst, ScalarU32 lhs, ScalarU32 rhs) {
ctx.Add("SLT.U {}.x,{},{};", inst, lhs, rhs);
}
void EmitIEqual(EmitContext& ctx, IR::Inst& inst, ScalarS32 lhs, ScalarS32 rhs) {
ctx.Add("SEQ.S {}.x,{},{};", inst, lhs, rhs);
}
void EmitSLessThanEqual(EmitContext& ctx, IR::Inst& inst, ScalarS32 lhs, ScalarS32 rhs) {
ctx.Add("SLE.S {}.x,{},{};", inst, lhs, rhs);
}
void EmitULessThanEqual(EmitContext& ctx, IR::Inst& inst, ScalarU32 lhs, ScalarU32 rhs) {
ctx.Add("SLE.U {}.x,{},{};", inst, lhs, rhs);
}
void EmitSGreaterThan(EmitContext& ctx, IR::Inst& inst, ScalarS32 lhs, ScalarS32 rhs) {
ctx.Add("SGT.S {}.x,{},{};", inst, lhs, rhs);
}
void EmitUGreaterThan(EmitContext& ctx, IR::Inst& inst, ScalarU32 lhs, ScalarU32 rhs) {
ctx.Add("SGT.U {}.x,{},{};", inst, lhs, rhs);
}
void EmitINotEqual(EmitContext& ctx, IR::Inst& inst, ScalarS32 lhs, ScalarS32 rhs) {
ctx.Add("SNE.U {}.x,{},{};", inst, lhs, rhs);
}
void EmitSGreaterThanEqual(EmitContext& ctx, IR::Inst& inst, ScalarS32 lhs, ScalarS32 rhs) {
ctx.Add("SGE.S {}.x,{},{};", inst, lhs, rhs);
}
void EmitUGreaterThanEqual(EmitContext& ctx, IR::Inst& inst, ScalarU32 lhs, ScalarU32 rhs) {
ctx.Add("SGE.U {}.x,{},{};", inst, lhs, rhs);
}
} // namespace Shader::Backend::GLASM

View File

@@ -0,0 +1,568 @@
// Copyright 2021 yuzu Emulator Project
// Licensed under GPLv2 or any later version
// Refer to the license.txt file included.
#include <string_view>
#include "shader_recompiler/backend/glasm/emit_context.h"
#include "shader_recompiler/backend/glasm/emit_glasm_instructions.h"
#include "shader_recompiler/frontend/ir/program.h"
#include "shader_recompiler/frontend/ir/value.h"
#include "shader_recompiler/runtime_info.h"
namespace Shader::Backend::GLASM {
namespace {
void StorageOp(EmitContext& ctx, const IR::Value& binding, ScalarU32 offset,
std::string_view then_expr, std::string_view else_expr = {}) {
// Operate on bindless SSBO, call the expression with bounds checking
// address = c[binding].xy
// length = c[binding].z
const u32 sb_binding{binding.U32()};
ctx.Add("PK64.U DC,c[{}];" // pointer = address
"CVT.U64.U32 DC.z,{};" // offset = uint64_t(offset)
"ADD.U64 DC.x,DC.x,DC.z;" // pointer += offset
"SLT.U.CC RC.x,{},c[{}].z;", // cc = offset < length
sb_binding, offset, offset, sb_binding);
if (else_expr.empty()) {
ctx.Add("IF NE.x;{}ENDIF;", then_expr);
} else {
ctx.Add("IF NE.x;{}ELSE;{}ENDIF;", then_expr, else_expr);
}
}
void GlobalStorageOp(EmitContext& ctx, Register address, bool pointer_based, std::string_view expr,
std::string_view else_expr = {}) {
const size_t num_buffers{ctx.info.storage_buffers_descriptors.size()};
for (size_t index = 0; index < num_buffers; ++index) {
if (!ctx.info.nvn_buffer_used[index]) {
continue;
}
const auto& ssbo{ctx.info.storage_buffers_descriptors[index]};
ctx.Add("LDC.U64 DC.x,c{}[{}];" // ssbo_addr
"LDC.U32 RC.x,c{}[{}];" // ssbo_size_u32
"CVT.U64.U32 DC.y,RC.x;" // ssbo_size = ssbo_size_u32
"ADD.U64 DC.y,DC.y,DC.x;" // ssbo_end = ssbo_addr + ssbo_size
"SGE.U64 RC.x,{}.x,DC.x;" // a = input_addr >= ssbo_addr ? -1 : 0
"SLT.U64 RC.y,{}.x,DC.y;" // b = input_addr < ssbo_end ? -1 : 0
"AND.U.CC RC.x,RC.x,RC.y;" // cond = a && b
"IF NE.x;" // if cond
"SUB.U64 DC.x,{}.x,DC.x;", // offset = input_addr - ssbo_addr
ssbo.cbuf_index, ssbo.cbuf_offset, ssbo.cbuf_index, ssbo.cbuf_offset + 8, address,
address, address);
if (pointer_based) {
ctx.Add("PK64.U DC.y,c[{}];" // host_ssbo = cbuf
"ADD.U64 DC.x,DC.x,DC.y;" // host_addr = host_ssbo + offset
"{}"
"ELSE;",
index, expr);
} else {
ctx.Add("CVT.U32.U64 RC.x,DC.x;"
"{},ssbo{}[RC.x];"
"ELSE;",
expr, index);
}
}
if (!else_expr.empty()) {
ctx.Add("{}", else_expr);
}
const size_t num_used_buffers{ctx.info.nvn_buffer_used.count()};
for (size_t index = 0; index < num_used_buffers; ++index) {
ctx.Add("ENDIF;");
}
}
template <typename ValueType>
void Write(EmitContext& ctx, const IR::Value& binding, ScalarU32 offset, ValueType value,
std::string_view size) {
if (ctx.runtime_info.glasm_use_storage_buffers) {
ctx.Add("STB.{} {},ssbo{}[{}];", size, value, binding.U32(), offset);
} else {
StorageOp(ctx, binding, offset, fmt::format("STORE.{} {},DC.x;", size, value));
}
}
void Load(EmitContext& ctx, IR::Inst& inst, const IR::Value& binding, ScalarU32 offset,
std::string_view size) {
const Register ret{ctx.reg_alloc.Define(inst)};
if (ctx.runtime_info.glasm_use_storage_buffers) {
ctx.Add("LDB.{} {},ssbo{}[{}];", size, ret, binding.U32(), offset);
} else {
StorageOp(ctx, binding, offset, fmt::format("LOAD.{} {},DC.x;", size, ret),
fmt::format("MOV.U {},{{0,0,0,0}};", ret));
}
}
template <typename ValueType>
void GlobalWrite(EmitContext& ctx, Register address, ValueType value, std::string_view size) {
if (ctx.runtime_info.glasm_use_storage_buffers) {
GlobalStorageOp(ctx, address, false, fmt::format("STB.{} {}", size, value));
} else {
GlobalStorageOp(ctx, address, true, fmt::format("STORE.{} {},DC.x;", size, value));
}
}
void GlobalLoad(EmitContext& ctx, IR::Inst& inst, Register address, std::string_view size) {
const Register ret{ctx.reg_alloc.Define(inst)};
if (ctx.runtime_info.glasm_use_storage_buffers) {
GlobalStorageOp(ctx, address, false, fmt::format("LDB.{} {}", size, ret));
} else {
GlobalStorageOp(ctx, address, true, fmt::format("LOAD.{} {},DC.x;", size, ret),
fmt::format("MOV.S {},0;", ret));
}
}
template <typename ValueType>
void Atom(EmitContext& ctx, IR::Inst& inst, const IR::Value& binding, ScalarU32 offset,
ValueType value, std::string_view operation, std::string_view size) {
const Register ret{ctx.reg_alloc.Define(inst)};
if (ctx.runtime_info.glasm_use_storage_buffers) {
ctx.Add("ATOMB.{}.{} {},{},ssbo{}[{}];", operation, size, ret, value, binding.U32(),
offset);
} else {
StorageOp(ctx, binding, offset,
fmt::format("ATOM.{}.{} {},{},DC.x;", operation, size, ret, value));
}
}
} // Anonymous namespace
void EmitLoadGlobalU8(EmitContext& ctx, IR::Inst& inst, Register address) {
GlobalLoad(ctx, inst, address, "U8");
}
void EmitLoadGlobalS8(EmitContext& ctx, IR::Inst& inst, Register address) {
GlobalLoad(ctx, inst, address, "S8");
}
void EmitLoadGlobalU16(EmitContext& ctx, IR::Inst& inst, Register address) {
GlobalLoad(ctx, inst, address, "U16");
}
void EmitLoadGlobalS16(EmitContext& ctx, IR::Inst& inst, Register address) {
GlobalLoad(ctx, inst, address, "S16");
}
void EmitLoadGlobal32(EmitContext& ctx, IR::Inst& inst, Register address) {
GlobalLoad(ctx, inst, address, "U32");
}
void EmitLoadGlobal64(EmitContext& ctx, IR::Inst& inst, Register address) {
GlobalLoad(ctx, inst, address, "U32X2");
}
void EmitLoadGlobal128(EmitContext& ctx, IR::Inst& inst, Register address) {
GlobalLoad(ctx, inst, address, "U32X4");
}
void EmitWriteGlobalU8(EmitContext& ctx, Register address, Register value) {
GlobalWrite(ctx, address, value, "U8");
}
void EmitWriteGlobalS8(EmitContext& ctx, Register address, Register value) {
GlobalWrite(ctx, address, value, "S8");
}
void EmitWriteGlobalU16(EmitContext& ctx, Register address, Register value) {
GlobalWrite(ctx, address, value, "U16");
}
void EmitWriteGlobalS16(EmitContext& ctx, Register address, Register value) {
GlobalWrite(ctx, address, value, "S16");
}
void EmitWriteGlobal32(EmitContext& ctx, Register address, ScalarU32 value) {
GlobalWrite(ctx, address, value, "U32");
}
void EmitWriteGlobal64(EmitContext& ctx, Register address, Register value) {
GlobalWrite(ctx, address, value, "U32X2");
}
void EmitWriteGlobal128(EmitContext& ctx, Register address, Register value) {
GlobalWrite(ctx, address, value, "U32X4");
}
void EmitLoadStorageU8(EmitContext& ctx, IR::Inst& inst, const IR::Value& binding,
ScalarU32 offset) {
Load(ctx, inst, binding, offset, "U8");
}
void EmitLoadStorageS8(EmitContext& ctx, IR::Inst& inst, const IR::Value& binding,
ScalarU32 offset) {
Load(ctx, inst, binding, offset, "S8");
}
void EmitLoadStorageU16(EmitContext& ctx, IR::Inst& inst, const IR::Value& binding,
ScalarU32 offset) {
Load(ctx, inst, binding, offset, "U16");
}
void EmitLoadStorageS16(EmitContext& ctx, IR::Inst& inst, const IR::Value& binding,
ScalarU32 offset) {
Load(ctx, inst, binding, offset, "S16");
}
void EmitLoadStorage32(EmitContext& ctx, IR::Inst& inst, const IR::Value& binding,
ScalarU32 offset) {
Load(ctx, inst, binding, offset, "U32");
}
void EmitLoadStorage64(EmitContext& ctx, IR::Inst& inst, const IR::Value& binding,
ScalarU32 offset) {
Load(ctx, inst, binding, offset, "U32X2");
}
void EmitLoadStorage128(EmitContext& ctx, IR::Inst& inst, const IR::Value& binding,
ScalarU32 offset) {
Load(ctx, inst, binding, offset, "U32X4");
}
void EmitWriteStorageU8(EmitContext& ctx, const IR::Value& binding, ScalarU32 offset,
ScalarU32 value) {
Write(ctx, binding, offset, value, "U8");
}
void EmitWriteStorageS8(EmitContext& ctx, const IR::Value& binding, ScalarU32 offset,
ScalarS32 value) {
Write(ctx, binding, offset, value, "S8");
}
void EmitWriteStorageU16(EmitContext& ctx, const IR::Value& binding, ScalarU32 offset,
ScalarU32 value) {
Write(ctx, binding, offset, value, "U16");
}
void EmitWriteStorageS16(EmitContext& ctx, const IR::Value& binding, ScalarU32 offset,
ScalarS32 value) {
Write(ctx, binding, offset, value, "S16");
}
void EmitWriteStorage32(EmitContext& ctx, const IR::Value& binding, ScalarU32 offset,
ScalarU32 value) {
Write(ctx, binding, offset, value, "U32");
}
void EmitWriteStorage64(EmitContext& ctx, const IR::Value& binding, ScalarU32 offset,
Register value) {
Write(ctx, binding, offset, value, "U32X2");
}
void EmitWriteStorage128(EmitContext& ctx, const IR::Value& binding, ScalarU32 offset,
Register value) {
Write(ctx, binding, offset, value, "U32X4");
}
void EmitSharedAtomicIAdd32(EmitContext& ctx, IR::Inst& inst, ScalarU32 pointer_offset,
ScalarU32 value) {
ctx.Add("ATOMS.ADD.U32 {},{},shared_mem[{}];", inst, value, pointer_offset);
}
void EmitSharedAtomicSMin32(EmitContext& ctx, IR::Inst& inst, ScalarU32 pointer_offset,
ScalarS32 value) {
ctx.Add("ATOMS.MIN.S32 {},{},shared_mem[{}];", inst, value, pointer_offset);
}
void EmitSharedAtomicUMin32(EmitContext& ctx, IR::Inst& inst, ScalarU32 pointer_offset,
ScalarU32 value) {
ctx.Add("ATOMS.MIN.U32 {},{},shared_mem[{}];", inst, value, pointer_offset);
}
void EmitSharedAtomicSMax32(EmitContext& ctx, IR::Inst& inst, ScalarU32 pointer_offset,
ScalarS32 value) {
ctx.Add("ATOMS.MAX.S32 {},{},shared_mem[{}];", inst, value, pointer_offset);
}
void EmitSharedAtomicUMax32(EmitContext& ctx, IR::Inst& inst, ScalarU32 pointer_offset,
ScalarU32 value) {
ctx.Add("ATOMS.MAX.U32 {},{},shared_mem[{}];", inst, value, pointer_offset);
}
void EmitSharedAtomicInc32(EmitContext& ctx, IR::Inst& inst, ScalarU32 pointer_offset,
ScalarU32 value) {
ctx.Add("ATOMS.IWRAP.U32 {},{},shared_mem[{}];", inst, value, pointer_offset);
}
void EmitSharedAtomicDec32(EmitContext& ctx, IR::Inst& inst, ScalarU32 pointer_offset,
ScalarU32 value) {
ctx.Add("ATOMS.DWRAP.U32 {},{},shared_mem[{}];", inst, value, pointer_offset);
}
void EmitSharedAtomicAnd32(EmitContext& ctx, IR::Inst& inst, ScalarU32 pointer_offset,
ScalarU32 value) {
ctx.Add("ATOMS.AND.U32 {},{},shared_mem[{}];", inst, value, pointer_offset);
}
void EmitSharedAtomicOr32(EmitContext& ctx, IR::Inst& inst, ScalarU32 pointer_offset,
ScalarU32 value) {
ctx.Add("ATOMS.OR.U32 {},{},shared_mem[{}];", inst, value, pointer_offset);
}
void EmitSharedAtomicXor32(EmitContext& ctx, IR::Inst& inst, ScalarU32 pointer_offset,
ScalarU32 value) {
ctx.Add("ATOMS.XOR.U32 {},{},shared_mem[{}];", inst, value, pointer_offset);
}
void EmitSharedAtomicExchange32(EmitContext& ctx, IR::Inst& inst, ScalarU32 pointer_offset,
ScalarU32 value) {
ctx.Add("ATOMS.EXCH.U32 {},{},shared_mem[{}];", inst, value, pointer_offset);
}
void EmitSharedAtomicExchange64(EmitContext& ctx, IR::Inst& inst, ScalarU32 pointer_offset,
Register value) {
ctx.LongAdd("ATOMS.EXCH.U64 {}.x,{},shared_mem[{}];", inst, value, pointer_offset);
}
void EmitStorageAtomicIAdd32(EmitContext& ctx, IR::Inst& inst, const IR::Value& binding,
ScalarU32 offset, ScalarU32 value) {
Atom(ctx, inst, binding, offset, value, "ADD", "U32");
}
void EmitStorageAtomicSMin32(EmitContext& ctx, IR::Inst& inst, const IR::Value& binding,
ScalarU32 offset, ScalarS32 value) {
Atom(ctx, inst, binding, offset, value, "MIN", "S32");
}
void EmitStorageAtomicUMin32(EmitContext& ctx, IR::Inst& inst, const IR::Value& binding,
ScalarU32 offset, ScalarU32 value) {
Atom(ctx, inst, binding, offset, value, "MIN", "U32");
}
void EmitStorageAtomicSMax32(EmitContext& ctx, IR::Inst& inst, const IR::Value& binding,
ScalarU32 offset, ScalarS32 value) {
Atom(ctx, inst, binding, offset, value, "MAX", "S32");
}
void EmitStorageAtomicUMax32(EmitContext& ctx, IR::Inst& inst, const IR::Value& binding,
ScalarU32 offset, ScalarU32 value) {
Atom(ctx, inst, binding, offset, value, "MAX", "U32");
}
void EmitStorageAtomicInc32(EmitContext& ctx, IR::Inst& inst, const IR::Value& binding,
ScalarU32 offset, ScalarU32 value) {
Atom(ctx, inst, binding, offset, value, "IWRAP", "U32");
}
void EmitStorageAtomicDec32(EmitContext& ctx, IR::Inst& inst, const IR::Value& binding,
ScalarU32 offset, ScalarU32 value) {
Atom(ctx, inst, binding, offset, value, "DWRAP", "U32");
}
void EmitStorageAtomicAnd32(EmitContext& ctx, IR::Inst& inst, const IR::Value& binding,
ScalarU32 offset, ScalarU32 value) {
Atom(ctx, inst, binding, offset, value, "AND", "U32");
}
void EmitStorageAtomicOr32(EmitContext& ctx, IR::Inst& inst, const IR::Value& binding,
ScalarU32 offset, ScalarU32 value) {
Atom(ctx, inst, binding, offset, value, "OR", "U32");
}
void EmitStorageAtomicXor32(EmitContext& ctx, IR::Inst& inst, const IR::Value& binding,
ScalarU32 offset, ScalarU32 value) {
Atom(ctx, inst, binding, offset, value, "XOR", "U32");
}
void EmitStorageAtomicExchange32(EmitContext& ctx, IR::Inst& inst, const IR::Value& binding,
ScalarU32 offset, ScalarU32 value) {
Atom(ctx, inst, binding, offset, value, "EXCH", "U32");
}
void EmitStorageAtomicIAdd64(EmitContext& ctx, IR::Inst& inst, const IR::Value& binding,
ScalarU32 offset, Register value) {
Atom(ctx, inst, binding, offset, value, "ADD", "U64");
}
void EmitStorageAtomicSMin64(EmitContext& ctx, IR::Inst& inst, const IR::Value& binding,
ScalarU32 offset, Register value) {
Atom(ctx, inst, binding, offset, value, "MIN", "S64");
}
void EmitStorageAtomicUMin64(EmitContext& ctx, IR::Inst& inst, const IR::Value& binding,
ScalarU32 offset, Register value) {
Atom(ctx, inst, binding, offset, value, "MIN", "U64");
}
void EmitStorageAtomicSMax64(EmitContext& ctx, IR::Inst& inst, const IR::Value& binding,
ScalarU32 offset, Register value) {
Atom(ctx, inst, binding, offset, value, "MAX", "S64");
}
void EmitStorageAtomicUMax64(EmitContext& ctx, IR::Inst& inst, const IR::Value& binding,
ScalarU32 offset, Register value) {
Atom(ctx, inst, binding, offset, value, "MAX", "U64");
}
void EmitStorageAtomicAnd64(EmitContext& ctx, IR::Inst& inst, const IR::Value& binding,
ScalarU32 offset, Register value) {
Atom(ctx, inst, binding, offset, value, "AND", "U64");
}
void EmitStorageAtomicOr64(EmitContext& ctx, IR::Inst& inst, const IR::Value& binding,
ScalarU32 offset, Register value) {
Atom(ctx, inst, binding, offset, value, "OR", "U64");
}
void EmitStorageAtomicXor64(EmitContext& ctx, IR::Inst& inst, const IR::Value& binding,
ScalarU32 offset, Register value) {
Atom(ctx, inst, binding, offset, value, "XOR", "U64");
}
void EmitStorageAtomicExchange64(EmitContext& ctx, IR::Inst& inst, const IR::Value& binding,
ScalarU32 offset, Register value) {
Atom(ctx, inst, binding, offset, value, "EXCH", "U64");
}
void EmitStorageAtomicAddF32(EmitContext& ctx, IR::Inst& inst, const IR::Value& binding,
ScalarU32 offset, ScalarF32 value) {
Atom(ctx, inst, binding, offset, value, "ADD", "F32");
}
void EmitStorageAtomicAddF16x2(EmitContext& ctx, IR::Inst& inst, const IR::Value& binding,
ScalarU32 offset, Register value) {
Atom(ctx, inst, binding, offset, value, "ADD", "F16x2");
}
void EmitStorageAtomicAddF32x2([[maybe_unused]] EmitContext& ctx, [[maybe_unused]] IR::Inst& inst,
[[maybe_unused]] const IR::Value& binding,
[[maybe_unused]] ScalarU32 offset, [[maybe_unused]] Register value) {
throw NotImplementedException("GLASM instruction");
}
void EmitStorageAtomicMinF16x2(EmitContext& ctx, IR::Inst& inst, const IR::Value& binding,
ScalarU32 offset, Register value) {
Atom(ctx, inst, binding, offset, value, "MIN", "F16x2");
}
void EmitStorageAtomicMinF32x2([[maybe_unused]] EmitContext& ctx, [[maybe_unused]] IR::Inst& inst,
[[maybe_unused]] const IR::Value& binding,
[[maybe_unused]] ScalarU32 offset, [[maybe_unused]] Register value) {
throw NotImplementedException("GLASM instruction");
}
void EmitStorageAtomicMaxF16x2(EmitContext& ctx, IR::Inst& inst, const IR::Value& binding,
ScalarU32 offset, Register value) {
Atom(ctx, inst, binding, offset, value, "MAX", "F16x2");
}
void EmitStorageAtomicMaxF32x2([[maybe_unused]] EmitContext& ctx, [[maybe_unused]] IR::Inst& inst,
[[maybe_unused]] const IR::Value& binding,
[[maybe_unused]] ScalarU32 offset, [[maybe_unused]] Register value) {
throw NotImplementedException("GLASM instruction");
}
void EmitGlobalAtomicIAdd32(EmitContext&) {
throw NotImplementedException("GLASM instruction");
}
void EmitGlobalAtomicSMin32(EmitContext&) {
throw NotImplementedException("GLASM instruction");
}
void EmitGlobalAtomicUMin32(EmitContext&) {
throw NotImplementedException("GLASM instruction");
}
void EmitGlobalAtomicSMax32(EmitContext&) {
throw NotImplementedException("GLASM instruction");
}
void EmitGlobalAtomicUMax32(EmitContext&) {
throw NotImplementedException("GLASM instruction");
}
void EmitGlobalAtomicInc32(EmitContext&) {
throw NotImplementedException("GLASM instruction");
}
void EmitGlobalAtomicDec32(EmitContext&) {
throw NotImplementedException("GLASM instruction");
}
void EmitGlobalAtomicAnd32(EmitContext&) {
throw NotImplementedException("GLASM instruction");
}
void EmitGlobalAtomicOr32(EmitContext&) {
throw NotImplementedException("GLASM instruction");
}
void EmitGlobalAtomicXor32(EmitContext&) {
throw NotImplementedException("GLASM instruction");
}
void EmitGlobalAtomicExchange32(EmitContext&) {
throw NotImplementedException("GLASM instruction");
}
void EmitGlobalAtomicIAdd64(EmitContext&) {
throw NotImplementedException("GLASM instruction");
}
void EmitGlobalAtomicSMin64(EmitContext&) {
throw NotImplementedException("GLASM instruction");
}
void EmitGlobalAtomicUMin64(EmitContext&) {
throw NotImplementedException("GLASM instruction");
}
void EmitGlobalAtomicSMax64(EmitContext&) {
throw NotImplementedException("GLASM instruction");
}
void EmitGlobalAtomicUMax64(EmitContext&) {
throw NotImplementedException("GLASM instruction");
}
void EmitGlobalAtomicInc64(EmitContext&) {
throw NotImplementedException("GLASM instruction");
}
void EmitGlobalAtomicDec64(EmitContext&) {
throw NotImplementedException("GLASM instruction");
}
void EmitGlobalAtomicAnd64(EmitContext&) {
throw NotImplementedException("GLASM instruction");
}
void EmitGlobalAtomicOr64(EmitContext&) {
throw NotImplementedException("GLASM instruction");
}
void EmitGlobalAtomicXor64(EmitContext&) {
throw NotImplementedException("GLASM instruction");
}
void EmitGlobalAtomicExchange64(EmitContext&) {
throw NotImplementedException("GLASM instruction");
}
void EmitGlobalAtomicAddF32(EmitContext&) {
throw NotImplementedException("GLASM instruction");
}
void EmitGlobalAtomicAddF16x2(EmitContext&) {
throw NotImplementedException("GLASM instruction");
}
void EmitGlobalAtomicAddF32x2(EmitContext&) {
throw NotImplementedException("GLASM instruction");
}
void EmitGlobalAtomicMinF16x2(EmitContext&) {
throw NotImplementedException("GLASM instruction");
}
void EmitGlobalAtomicMinF32x2(EmitContext&) {
throw NotImplementedException("GLASM instruction");
}
void EmitGlobalAtomicMaxF16x2(EmitContext&) {
throw NotImplementedException("GLASM instruction");
}
void EmitGlobalAtomicMaxF32x2(EmitContext&) {
throw NotImplementedException("GLASM instruction");
}
} // namespace Shader::Backend::GLASM

View File

@@ -0,0 +1,273 @@
// Copyright 2021 yuzu Emulator Project
// Licensed under GPLv2 or any later version
// Refer to the license.txt file included.
#include <string_view>
#include "shader_recompiler/backend/glasm/emit_context.h"
#include "shader_recompiler/backend/glasm/emit_glasm_instructions.h"
#include "shader_recompiler/frontend/ir/program.h"
#include "shader_recompiler/frontend/ir/value.h"
#ifdef _MSC_VER
#pragma warning(disable : 4100)
#endif
namespace Shader::Backend::GLASM {
#define NotImplemented() throw NotImplementedException("GLASM instruction {}", __LINE__)
static void DefinePhi(EmitContext& ctx, IR::Inst& phi) {
switch (phi.Arg(0).Type()) {
case IR::Type::U1:
case IR::Type::U32:
case IR::Type::F32:
ctx.reg_alloc.Define(phi);
break;
case IR::Type::U64:
case IR::Type::F64:
ctx.reg_alloc.LongDefine(phi);
break;
default:
throw NotImplementedException("Phi node type {}", phi.Type());
}
}
void EmitPhi(EmitContext& ctx, IR::Inst& phi) {
const size_t num_args{phi.NumArgs()};
for (size_t i = 0; i < num_args; ++i) {
ctx.reg_alloc.Consume(phi.Arg(i));
}
if (!phi.Definition<Id>().is_valid) {
// The phi node wasn't forward defined
DefinePhi(ctx, phi);
}
}
void EmitVoid(EmitContext&) {}
void EmitReference(EmitContext& ctx, const IR::Value& value) {
ctx.reg_alloc.Consume(value);
}
void EmitPhiMove(EmitContext& ctx, const IR::Value& phi_value, const IR::Value& value) {
IR::Inst& phi{RegAlloc::AliasInst(*phi_value.Inst())};
if (!phi.Definition<Id>().is_valid) {
// The phi node wasn't forward defined
DefinePhi(ctx, phi);
}
const Register phi_reg{ctx.reg_alloc.Consume(IR::Value{&phi})};
const Value eval_value{ctx.reg_alloc.Consume(value)};
if (phi_reg == eval_value) {
return;
}
switch (phi.Flags<IR::Type>()) {
case IR::Type::U1:
case IR::Type::U32:
case IR::Type::F32:
ctx.Add("MOV.S {}.x,{};", phi_reg, ScalarS32{eval_value});
break;
case IR::Type::U64:
case IR::Type::F64:
ctx.Add("MOV.U64 {}.x,{};", phi_reg, ScalarRegister{eval_value});
break;
default:
throw NotImplementedException("Phi node type {}", phi.Type());
}
}
void EmitJoin(EmitContext& ctx) {
NotImplemented();
}
void EmitDemoteToHelperInvocation(EmitContext& ctx) {
ctx.Add("KIL TR.x;");
}
void EmitBarrier(EmitContext& ctx) {
ctx.Add("BAR;");
}
void EmitWorkgroupMemoryBarrier(EmitContext& ctx) {
ctx.Add("MEMBAR.CTA;");
}
void EmitDeviceMemoryBarrier(EmitContext& ctx) {
ctx.Add("MEMBAR;");
}
void EmitPrologue(EmitContext& ctx) {
// TODO
}
void EmitEpilogue(EmitContext& ctx) {
// TODO
}
void EmitEmitVertex(EmitContext& ctx, ScalarS32 stream) {
if (stream.type == Type::U32 && stream.imm_u32 == 0) {
ctx.Add("EMIT;");
} else {
ctx.Add("EMITS {};", stream);
}
}
void EmitEndPrimitive(EmitContext& ctx, const IR::Value& stream) {
if (!stream.IsImmediate()) {
LOG_WARNING(Shader_GLASM, "Stream is not immediate");
}
ctx.reg_alloc.Consume(stream);
ctx.Add("ENDPRIM;");
}
void EmitGetRegister(EmitContext& ctx) {
NotImplemented();
}
void EmitSetRegister(EmitContext& ctx) {
NotImplemented();
}
void EmitGetPred(EmitContext& ctx) {
NotImplemented();
}
void EmitSetPred(EmitContext& ctx) {
NotImplemented();
}
void EmitSetGotoVariable(EmitContext& ctx) {
NotImplemented();
}
void EmitGetGotoVariable(EmitContext& ctx) {
NotImplemented();
}
void EmitSetIndirectBranchVariable(EmitContext& ctx) {
NotImplemented();
}
void EmitGetIndirectBranchVariable(EmitContext& ctx) {
NotImplemented();
}
void EmitGetZFlag(EmitContext& ctx) {
NotImplemented();
}
void EmitGetSFlag(EmitContext& ctx) {
NotImplemented();
}
void EmitGetCFlag(EmitContext& ctx) {
NotImplemented();
}
void EmitGetOFlag(EmitContext& ctx) {
NotImplemented();
}
void EmitSetZFlag(EmitContext& ctx) {
NotImplemented();
}
void EmitSetSFlag(EmitContext& ctx) {
NotImplemented();
}
void EmitSetCFlag(EmitContext& ctx) {
NotImplemented();
}
void EmitSetOFlag(EmitContext& ctx) {
NotImplemented();
}
void EmitWorkgroupId(EmitContext& ctx, IR::Inst& inst) {
ctx.Add("MOV.S {},invocation.groupid;", inst);
}
void EmitLocalInvocationId(EmitContext& ctx, IR::Inst& inst) {
ctx.Add("MOV.S {},invocation.localid;", inst);
}
void EmitInvocationId(EmitContext& ctx, IR::Inst& inst) {
ctx.Add("MOV.S {}.x,primitive_invocation.x;", inst);
}
void EmitSampleId(EmitContext& ctx, IR::Inst& inst) {
ctx.Add("MOV.S {}.x,fragment.sampleid.x;", inst);
}
void EmitIsHelperInvocation(EmitContext& ctx, IR::Inst& inst) {
ctx.Add("MOV.S {}.x,fragment.helperthread.x;", inst);
}
void EmitYDirection(EmitContext& ctx, IR::Inst& inst) {
ctx.uses_y_direction = true;
ctx.Add("MOV.F {}.x,y_direction[0].w;", inst);
}
void EmitUndefU1(EmitContext& ctx, IR::Inst& inst) {
ctx.Add("MOV.S {}.x,0;", inst);
}
void EmitUndefU8(EmitContext& ctx, IR::Inst& inst) {
ctx.Add("MOV.S {}.x,0;", inst);
}
void EmitUndefU16(EmitContext& ctx, IR::Inst& inst) {
ctx.Add("MOV.S {}.x,0;", inst);
}
void EmitUndefU32(EmitContext& ctx, IR::Inst& inst) {
ctx.Add("MOV.S {}.x,0;", inst);
}
void EmitUndefU64(EmitContext& ctx, IR::Inst& inst) {
ctx.LongAdd("MOV.S64 {}.x,0;", inst);
}
void EmitGetZeroFromOp(EmitContext& ctx) {
NotImplemented();
}
void EmitGetSignFromOp(EmitContext& ctx) {
NotImplemented();
}
void EmitGetCarryFromOp(EmitContext& ctx) {
NotImplemented();
}
void EmitGetOverflowFromOp(EmitContext& ctx) {
NotImplemented();
}
void EmitGetSparseFromOp(EmitContext& ctx) {
NotImplemented();
}
void EmitGetInBoundsFromOp(EmitContext& ctx) {
NotImplemented();
}
void EmitLogicalOr(EmitContext& ctx, IR::Inst& inst, ScalarS32 a, ScalarS32 b) {
ctx.Add("OR.S {},{},{};", inst, a, b);
}
void EmitLogicalAnd(EmitContext& ctx, IR::Inst& inst, ScalarS32 a, ScalarS32 b) {
ctx.Add("AND.S {},{},{};", inst, a, b);
}
void EmitLogicalXor(EmitContext& ctx, IR::Inst& inst, ScalarS32 a, ScalarS32 b) {
ctx.Add("XOR.S {},{},{};", inst, a, b);
}
void EmitLogicalNot(EmitContext& ctx, IR::Inst& inst, ScalarS32 value) {
ctx.Add("SEQ.S {},{},0;", inst, value);
}
} // namespace Shader::Backend::GLASM

View File

@@ -0,0 +1,67 @@
// Copyright 2021 yuzu Emulator Project
// Licensed under GPLv2 or any later version
// Refer to the license.txt file included.
#include "shader_recompiler/backend/glasm/emit_context.h"
#include "shader_recompiler/backend/glasm/emit_glasm_instructions.h"
#include "shader_recompiler/frontend/ir/value.h"
namespace Shader::Backend::GLASM {
void EmitSelectU1(EmitContext& ctx, IR::Inst& inst, ScalarS32 cond, ScalarS32 true_value,
ScalarS32 false_value) {
ctx.Add("CMP.S {},{},{},{};", inst, cond, true_value, false_value);
}
void EmitSelectU8([[maybe_unused]] EmitContext& ctx, [[maybe_unused]] ScalarS32 cond,
[[maybe_unused]] ScalarS32 true_value, [[maybe_unused]] ScalarS32 false_value) {
throw NotImplementedException("GLASM instruction");
}
void EmitSelectU16([[maybe_unused]] EmitContext& ctx, [[maybe_unused]] ScalarS32 cond,
[[maybe_unused]] ScalarS32 true_value, [[maybe_unused]] ScalarS32 false_value) {
throw NotImplementedException("GLASM instruction");
}
void EmitSelectU32(EmitContext& ctx, IR::Inst& inst, ScalarS32 cond, ScalarS32 true_value,
ScalarS32 false_value) {
ctx.Add("CMP.S {},{},{},{};", inst, cond, true_value, false_value);
}
void EmitSelectU64(EmitContext& ctx, IR::Inst& inst, ScalarS32 cond, Register true_value,
Register false_value) {
ctx.reg_alloc.InvalidateConditionCodes();
const Register ret{ctx.reg_alloc.LongDefine(inst)};
if (ret == true_value) {
ctx.Add("MOV.S.CC RC.x,{};"
"MOV.U64 {}.x(EQ.x),{};",
cond, ret, false_value);
} else if (ret == false_value) {
ctx.Add("MOV.S.CC RC.x,{};"
"MOV.U64 {}.x(NE.x),{};",
cond, ret, true_value);
} else {
ctx.Add("MOV.S.CC RC.x,{};"
"MOV.U64 {}.x,{};"
"MOV.U64 {}.x(NE.x),{};",
cond, ret, false_value, ret, true_value);
}
}
void EmitSelectF16([[maybe_unused]] EmitContext& ctx, [[maybe_unused]] ScalarS32 cond,
[[maybe_unused]] Register true_value, [[maybe_unused]] Register false_value) {
throw NotImplementedException("GLASM instruction");
}
void EmitSelectF32(EmitContext& ctx, IR::Inst& inst, ScalarS32 cond, ScalarS32 true_value,
ScalarS32 false_value) {
ctx.Add("CMP.S {},{},{},{};", inst, cond, true_value, false_value);
}
void EmitSelectF64([[maybe_unused]] EmitContext& ctx, [[maybe_unused]] ScalarS32 cond,
[[maybe_unused]] Register true_value, [[maybe_unused]] Register false_value) {
throw NotImplementedException("GLASM instruction");
}
} // namespace Shader::Backend::GLASM

View File

@@ -0,0 +1,58 @@
// Copyright 2021 yuzu Emulator Project
// Licensed under GPLv2 or any later version
// Refer to the license.txt file included.
#include "shader_recompiler/backend/glasm/emit_context.h"
#include "shader_recompiler/backend/glasm/emit_glasm_instructions.h"
#include "shader_recompiler/frontend/ir/value.h"
namespace Shader::Backend::GLASM {
void EmitLoadSharedU8(EmitContext& ctx, IR::Inst& inst, ScalarU32 offset) {
ctx.Add("LDS.U8 {},shared_mem[{}];", inst, offset);
}
void EmitLoadSharedS8(EmitContext& ctx, IR::Inst& inst, ScalarU32 offset) {
ctx.Add("LDS.S8 {},shared_mem[{}];", inst, offset);
}
void EmitLoadSharedU16(EmitContext& ctx, IR::Inst& inst, ScalarU32 offset) {
ctx.Add("LDS.U16 {},shared_mem[{}];", inst, offset);
}
void EmitLoadSharedS16(EmitContext& ctx, IR::Inst& inst, ScalarU32 offset) {
ctx.Add("LDS.S16 {},shared_mem[{}];", inst, offset);
}
void EmitLoadSharedU32(EmitContext& ctx, IR::Inst& inst, ScalarU32 offset) {
ctx.Add("LDS.U32 {},shared_mem[{}];", inst, offset);
}
void EmitLoadSharedU64(EmitContext& ctx, IR::Inst& inst, ScalarU32 offset) {
ctx.Add("LDS.U32X2 {},shared_mem[{}];", inst, offset);
}
void EmitLoadSharedU128(EmitContext& ctx, IR::Inst& inst, ScalarU32 offset) {
ctx.Add("LDS.U32X4 {},shared_mem[{}];", inst, offset);
}
void EmitWriteSharedU8(EmitContext& ctx, ScalarU32 offset, ScalarU32 value) {
ctx.Add("STS.U8 {},shared_mem[{}];", value, offset);
}
void EmitWriteSharedU16(EmitContext& ctx, ScalarU32 offset, ScalarU32 value) {
ctx.Add("STS.U16 {},shared_mem[{}];", value, offset);
}
void EmitWriteSharedU32(EmitContext& ctx, ScalarU32 offset, ScalarU32 value) {
ctx.Add("STS.U32 {},shared_mem[{}];", value, offset);
}
void EmitWriteSharedU64(EmitContext& ctx, ScalarU32 offset, Register value) {
ctx.Add("STS.U32X2 {},shared_mem[{}];", value, offset);
}
void EmitWriteSharedU128(EmitContext& ctx, ScalarU32 offset, Register value) {
ctx.Add("STS.U32X4 {},shared_mem[{}];", value, offset);
}
} // namespace Shader::Backend::GLASM

View File

@@ -0,0 +1,150 @@
// Copyright 2021 yuzu Emulator Project
// Licensed under GPLv2 or any later version
// Refer to the license.txt file included.
#include "shader_recompiler/backend/glasm/emit_context.h"
#include "shader_recompiler/backend/glasm/emit_glasm_instructions.h"
#include "shader_recompiler/frontend/ir/value.h"
#include "shader_recompiler/profile.h"
namespace Shader::Backend::GLASM {
void EmitLaneId(EmitContext& ctx, IR::Inst& inst) {
ctx.Add("MOV.S {}.x,{}.threadid;", inst, ctx.stage_name);
}
void EmitVoteAll(EmitContext& ctx, IR::Inst& inst, ScalarS32 pred) {
ctx.Add("TGALL.S {}.x,{};", inst, pred);
}
void EmitVoteAny(EmitContext& ctx, IR::Inst& inst, ScalarS32 pred) {
ctx.Add("TGANY.S {}.x,{};", inst, pred);
}
void EmitVoteEqual(EmitContext& ctx, IR::Inst& inst, ScalarS32 pred) {
ctx.Add("TGEQ.S {}.x,{};", inst, pred);
}
void EmitSubgroupBallot(EmitContext& ctx, IR::Inst& inst, ScalarS32 pred) {
ctx.Add("TGBALLOT {}.x,{};", inst, pred);
}
void EmitSubgroupEqMask(EmitContext& ctx, IR::Inst& inst) {
ctx.Add("MOV.U {},{}.threadeqmask;", inst, ctx.stage_name);
}
void EmitSubgroupLtMask(EmitContext& ctx, IR::Inst& inst) {
ctx.Add("MOV.U {},{}.threadltmask;", inst, ctx.stage_name);
}
void EmitSubgroupLeMask(EmitContext& ctx, IR::Inst& inst) {
ctx.Add("MOV.U {},{}.threadlemask;", inst, ctx.stage_name);
}
void EmitSubgroupGtMask(EmitContext& ctx, IR::Inst& inst) {
ctx.Add("MOV.U {},{}.threadgtmask;", inst, ctx.stage_name);
}
void EmitSubgroupGeMask(EmitContext& ctx, IR::Inst& inst) {
ctx.Add("MOV.U {},{}.threadgemask;", inst, ctx.stage_name);
}
static void Shuffle(EmitContext& ctx, IR::Inst& inst, ScalarU32 value, ScalarU32 index,
const IR::Value& clamp, const IR::Value& segmentation_mask,
std::string_view op) {
IR::Inst* const in_bounds{inst.GetAssociatedPseudoOperation(IR::Opcode::GetInBoundsFromOp)};
if (in_bounds) {
in_bounds->Invalidate();
}
std::string mask;
if (clamp.IsImmediate() && segmentation_mask.IsImmediate()) {
mask = fmt::to_string(clamp.U32() | (segmentation_mask.U32() << 8));
} else {
mask = "RC";
ctx.Add("BFI.U RC.x,{{5,8,0,0}},{},{};",
ScalarU32{ctx.reg_alloc.Consume(segmentation_mask)},
ScalarU32{ctx.reg_alloc.Consume(clamp)});
}
const Register value_ret{ctx.reg_alloc.Define(inst)};
if (in_bounds) {
const Register bounds_ret{ctx.reg_alloc.Define(*in_bounds)};
ctx.Add("SHF{}.U {},{},{},{};"
"MOV.U {}.x,{}.y;",
op, bounds_ret, value, index, mask, value_ret, bounds_ret);
} else {
ctx.Add("SHF{}.U {},{},{},{};"
"MOV.U {}.x,{}.y;",
op, value_ret, value, index, mask, value_ret, value_ret);
}
}
void EmitShuffleIndex(EmitContext& ctx, IR::Inst& inst, ScalarU32 value, ScalarU32 index,
const IR::Value& clamp, const IR::Value& segmentation_mask) {
Shuffle(ctx, inst, value, index, clamp, segmentation_mask, "IDX");
}
void EmitShuffleUp(EmitContext& ctx, IR::Inst& inst, ScalarU32 value, ScalarU32 index,
const IR::Value& clamp, const IR::Value& segmentation_mask) {
Shuffle(ctx, inst, value, index, clamp, segmentation_mask, "UP");
}
void EmitShuffleDown(EmitContext& ctx, IR::Inst& inst, ScalarU32 value, ScalarU32 index,
const IR::Value& clamp, const IR::Value& segmentation_mask) {
Shuffle(ctx, inst, value, index, clamp, segmentation_mask, "DOWN");
}
void EmitShuffleButterfly(EmitContext& ctx, IR::Inst& inst, ScalarU32 value, ScalarU32 index,
const IR::Value& clamp, const IR::Value& segmentation_mask) {
Shuffle(ctx, inst, value, index, clamp, segmentation_mask, "XOR");
}
void EmitFSwizzleAdd(EmitContext& ctx, IR::Inst& inst, ScalarF32 op_a, ScalarF32 op_b,
ScalarU32 swizzle) {
const auto ret{ctx.reg_alloc.Define(inst)};
ctx.Add("AND.U RC.z,{}.threadid,3;"
"SHL.U RC.z,RC.z,1;"
"SHR.U RC.z,{},RC.z;"
"AND.U RC.z,RC.z,3;"
"MUL.F RC.x,{},FSWZA[RC.z];"
"MUL.F RC.y,{},FSWZB[RC.z];"
"ADD.F {}.x,RC.x,RC.y;",
ctx.stage_name, swizzle, op_a, op_b, ret);
}
void EmitDPdxFine(EmitContext& ctx, IR::Inst& inst, ScalarF32 p) {
if (ctx.profile.support_derivative_control) {
ctx.Add("DDX.FINE {}.x,{};", inst, p);
} else {
LOG_WARNING(Shader_GLASM, "Fine derivatives not supported by device");
ctx.Add("DDX {}.x,{};", inst, p);
}
}
void EmitDPdyFine(EmitContext& ctx, IR::Inst& inst, ScalarF32 p) {
if (ctx.profile.support_derivative_control) {
ctx.Add("DDY.FINE {}.x,{};", inst, p);
} else {
LOG_WARNING(Shader_GLASM, "Fine derivatives not supported by device");
ctx.Add("DDY {}.x,{};", inst, p);
}
}
void EmitDPdxCoarse(EmitContext& ctx, IR::Inst& inst, ScalarF32 p) {
if (ctx.profile.support_derivative_control) {
ctx.Add("DDX.COARSE {}.x,{};", inst, p);
} else {
LOG_WARNING(Shader_GLASM, "Coarse derivatives not supported by device");
ctx.Add("DDX {}.x,{};", inst, p);
}
}
void EmitDPdyCoarse(EmitContext& ctx, IR::Inst& inst, ScalarF32 p) {
if (ctx.profile.support_derivative_control) {
ctx.Add("DDY.COARSE {}.x,{};", inst, p);
} else {
LOG_WARNING(Shader_GLASM, "Coarse derivatives not supported by device");
ctx.Add("DDY {}.x,{};", inst, p);
}
}
} // namespace Shader::Backend::GLASM

View File

@@ -0,0 +1,186 @@
// Copyright 2021 yuzu Emulator Project
// Licensed under GPLv2 or any later version
// Refer to the license.txt file included.
#include <string>
#include <fmt/format.h>
#include "shader_recompiler/backend/glasm/emit_context.h"
#include "shader_recompiler/backend/glasm/reg_alloc.h"
#include "shader_recompiler/exception.h"
#include "shader_recompiler/frontend/ir/value.h"
namespace Shader::Backend::GLASM {
Register RegAlloc::Define(IR::Inst& inst) {
return Define(inst, false);
}
Register RegAlloc::LongDefine(IR::Inst& inst) {
return Define(inst, true);
}
Value RegAlloc::Peek(const IR::Value& value) {
if (value.IsImmediate()) {
return MakeImm(value);
} else {
return PeekInst(*value.Inst());
}
}
Value RegAlloc::Consume(const IR::Value& value) {
if (value.IsImmediate()) {
return MakeImm(value);
} else {
return ConsumeInst(*value.Inst());
}
}
void RegAlloc::Unref(IR::Inst& inst) {
IR::Inst& value_inst{AliasInst(inst)};
value_inst.DestructiveRemoveUsage();
if (!value_inst.HasUses()) {
Free(value_inst.Definition<Id>());
}
}
Register RegAlloc::AllocReg() {
Register ret;
ret.type = Type::Register;
ret.id = Alloc(false);
return ret;
}
Register RegAlloc::AllocLongReg() {
Register ret;
ret.type = Type::Register;
ret.id = Alloc(true);
return ret;
}
void RegAlloc::FreeReg(Register reg) {
Free(reg.id);
}
Value RegAlloc::MakeImm(const IR::Value& value) {
Value ret;
switch (value.Type()) {
case IR::Type::Void:
ret.type = Type::Void;
break;
case IR::Type::U1:
ret.type = Type::U32;
ret.imm_u32 = value.U1() ? 0xffffffff : 0;
break;
case IR::Type::U32:
ret.type = Type::U32;
ret.imm_u32 = value.U32();
break;
case IR::Type::F32:
ret.type = Type::U32;
ret.imm_u32 = Common::BitCast<u32>(value.F32());
break;
case IR::Type::U64:
ret.type = Type::U64;
ret.imm_u64 = value.U64();
break;
case IR::Type::F64:
ret.type = Type::U64;
ret.imm_u64 = Common::BitCast<u64>(value.F64());
break;
default:
throw NotImplementedException("Immediate type {}", value.Type());
}
return ret;
}
Register RegAlloc::Define(IR::Inst& inst, bool is_long) {
if (inst.HasUses()) {
inst.SetDefinition<Id>(Alloc(is_long));
} else {
Id id{};
id.is_long.Assign(is_long ? 1 : 0);
id.is_null.Assign(1);
inst.SetDefinition<Id>(id);
}
return Register{PeekInst(inst)};
}
Value RegAlloc::PeekInst(IR::Inst& inst) {
Value ret;
ret.type = Type::Register;
ret.id = inst.Definition<Id>();
return ret;
}
Value RegAlloc::ConsumeInst(IR::Inst& inst) {
Unref(inst);
return PeekInst(inst);
}
Id RegAlloc::Alloc(bool is_long) {
size_t& num_regs{is_long ? num_used_long_registers : num_used_registers};
std::bitset<NUM_REGS>& use{is_long ? long_register_use : register_use};
if (num_used_registers + num_used_long_registers < NUM_REGS) {
for (size_t reg = 0; reg < NUM_REGS; ++reg) {
if (use[reg]) {
continue;
}
num_regs = std::max(num_regs, reg + 1);
use[reg] = true;
Id ret{};
ret.is_valid.Assign(1);
ret.is_long.Assign(is_long ? 1 : 0);
ret.is_spill.Assign(0);
ret.is_condition_code.Assign(0);
ret.is_null.Assign(0);
ret.index.Assign(static_cast<u32>(reg));
return ret;
}
}
throw NotImplementedException("Register spilling");
}
void RegAlloc::Free(Id id) {
if (id.is_valid == 0) {
throw LogicError("Freeing invalid register");
}
if (id.is_spill != 0) {
throw NotImplementedException("Free spill");
}
if (id.is_long != 0) {
long_register_use[id.index] = false;
} else {
register_use[id.index] = false;
}
}
/*static*/ bool RegAlloc::IsAliased(const IR::Inst& inst) {
switch (inst.GetOpcode()) {
case IR::Opcode::Identity:
case IR::Opcode::BitCastU16F16:
case IR::Opcode::BitCastU32F32:
case IR::Opcode::BitCastU64F64:
case IR::Opcode::BitCastF16U16:
case IR::Opcode::BitCastF32U32:
case IR::Opcode::BitCastF64U64:
return true;
default:
return false;
}
}
/*static*/ IR::Inst& RegAlloc::AliasInst(IR::Inst& inst) {
IR::Inst* it{&inst};
while (IsAliased(*it)) {
const IR::Value arg{it->Arg(0)};
if (arg.IsImmediate()) {
break;
}
it = arg.InstRecursive();
}
return *it;
}
} // namespace Shader::Backend::GLASM

View File

@@ -0,0 +1,303 @@
// Copyright 2021 yuzu Emulator Project
// Licensed under GPLv2 or any later version
// Refer to the license.txt file included.
#pragma once
#include <bitset>
#include <fmt/format.h>
#include "common/bit_cast.h"
#include "common/bit_field.h"
#include "common/common_types.h"
#include "shader_recompiler/exception.h"
namespace Shader::IR {
class Inst;
class Value;
} // namespace Shader::IR
namespace Shader::Backend::GLASM {
class EmitContext;
enum class Type : u32 {
Void,
Register,
U32,
U64,
};
struct Id {
union {
u32 raw;
BitField<0, 1, u32> is_valid;
BitField<1, 1, u32> is_long;
BitField<2, 1, u32> is_spill;
BitField<3, 1, u32> is_condition_code;
BitField<4, 1, u32> is_null;
BitField<5, 27, u32> index;
};
bool operator==(Id rhs) const noexcept {
return raw == rhs.raw;
}
bool operator!=(Id rhs) const noexcept {
return !operator==(rhs);
}
};
static_assert(sizeof(Id) == sizeof(u32));
struct Value {
Type type;
union {
Id id;
u32 imm_u32;
u64 imm_u64;
};
bool operator==(const Value& rhs) const noexcept {
if (type != rhs.type) {
return false;
}
switch (type) {
case Type::Void:
return true;
case Type::Register:
return id == rhs.id;
case Type::U32:
return imm_u32 == rhs.imm_u32;
case Type::U64:
return imm_u64 == rhs.imm_u64;
}
return false;
}
bool operator!=(const Value& rhs) const noexcept {
return !operator==(rhs);
}
};
struct Register : Value {};
struct ScalarRegister : Value {};
struct ScalarU32 : Value {};
struct ScalarS32 : Value {};
struct ScalarF32 : Value {};
struct ScalarF64 : Value {};
class RegAlloc {
public:
RegAlloc() = default;
Register Define(IR::Inst& inst);
Register LongDefine(IR::Inst& inst);
[[nodiscard]] Value Peek(const IR::Value& value);
Value Consume(const IR::Value& value);
void Unref(IR::Inst& inst);
[[nodiscard]] Register AllocReg();
[[nodiscard]] Register AllocLongReg();
void FreeReg(Register reg);
void InvalidateConditionCodes() {
// This does nothing for now
}
[[nodiscard]] size_t NumUsedRegisters() const noexcept {
return num_used_registers;
}
[[nodiscard]] size_t NumUsedLongRegisters() const noexcept {
return num_used_long_registers;
}
[[nodiscard]] bool IsEmpty() const noexcept {
return register_use.none() && long_register_use.none();
}
/// Returns true if the instruction is expected to be aliased to another
static bool IsAliased(const IR::Inst& inst);
/// Returns the underlying value out of an alias sequence
static IR::Inst& AliasInst(IR::Inst& inst);
private:
static constexpr size_t NUM_REGS = 4096;
static constexpr size_t NUM_ELEMENTS = 4;
Value MakeImm(const IR::Value& value);
Register Define(IR::Inst& inst, bool is_long);
Value PeekInst(IR::Inst& inst);
Value ConsumeInst(IR::Inst& inst);
Id Alloc(bool is_long);
void Free(Id id);
size_t num_used_registers{};
size_t num_used_long_registers{};
std::bitset<NUM_REGS> register_use{};
std::bitset<NUM_REGS> long_register_use{};
};
template <bool scalar, typename FormatContext>
auto FormatTo(FormatContext& ctx, Id id) {
if (id.is_condition_code != 0) {
throw NotImplementedException("Condition code emission");
}
if (id.is_spill != 0) {
throw NotImplementedException("Spill emission");
}
if constexpr (scalar) {
if (id.is_null != 0) {
return fmt::format_to(ctx.out(), "{}", id.is_long != 0 ? "DC.x" : "RC.x");
}
if (id.is_long != 0) {
return fmt::format_to(ctx.out(), "D{}.x", id.index.Value());
} else {
return fmt::format_to(ctx.out(), "R{}.x", id.index.Value());
}
} else {
if (id.is_null != 0) {
return fmt::format_to(ctx.out(), "{}", id.is_long != 0 ? "DC" : "RC");
}
if (id.is_long != 0) {
return fmt::format_to(ctx.out(), "D{}", id.index.Value());
} else {
return fmt::format_to(ctx.out(), "R{}", id.index.Value());
}
}
}
} // namespace Shader::Backend::GLASM
template <>
struct fmt::formatter<Shader::Backend::GLASM::Id> {
constexpr auto parse(format_parse_context& ctx) {
return ctx.begin();
}
template <typename FormatContext>
auto format(Shader::Backend::GLASM::Id id, FormatContext& ctx) {
return Shader::Backend::GLASM::FormatTo<true>(ctx, id);
}
};
template <>
struct fmt::formatter<Shader::Backend::GLASM::Register> {
constexpr auto parse(format_parse_context& ctx) {
return ctx.begin();
}
template <typename FormatContext>
auto format(const Shader::Backend::GLASM::Register& value, FormatContext& ctx) {
if (value.type != Shader::Backend::GLASM::Type::Register) {
throw Shader::InvalidArgument("Register value type is not register");
}
return Shader::Backend::GLASM::FormatTo<false>(ctx, value.id);
}
};
template <>
struct fmt::formatter<Shader::Backend::GLASM::ScalarRegister> {
constexpr auto parse(format_parse_context& ctx) {
return ctx.begin();
}
template <typename FormatContext>
auto format(const Shader::Backend::GLASM::ScalarRegister& value, FormatContext& ctx) {
if (value.type != Shader::Backend::GLASM::Type::Register) {
throw Shader::InvalidArgument("Register value type is not register");
}
return Shader::Backend::GLASM::FormatTo<true>(ctx, value.id);
}
};
template <>
struct fmt::formatter<Shader::Backend::GLASM::ScalarU32> {
constexpr auto parse(format_parse_context& ctx) {
return ctx.begin();
}
template <typename FormatContext>
auto format(const Shader::Backend::GLASM::ScalarU32& value, FormatContext& ctx) {
switch (value.type) {
case Shader::Backend::GLASM::Type::Void:
break;
case Shader::Backend::GLASM::Type::Register:
return Shader::Backend::GLASM::FormatTo<true>(ctx, value.id);
case Shader::Backend::GLASM::Type::U32:
return fmt::format_to(ctx.out(), "{}", value.imm_u32);
case Shader::Backend::GLASM::Type::U64:
break;
}
throw Shader::InvalidArgument("Invalid value type {}", value.type);
}
};
template <>
struct fmt::formatter<Shader::Backend::GLASM::ScalarS32> {
constexpr auto parse(format_parse_context& ctx) {
return ctx.begin();
}
template <typename FormatContext>
auto format(const Shader::Backend::GLASM::ScalarS32& value, FormatContext& ctx) {
switch (value.type) {
case Shader::Backend::GLASM::Type::Void:
break;
case Shader::Backend::GLASM::Type::Register:
return Shader::Backend::GLASM::FormatTo<true>(ctx, value.id);
case Shader::Backend::GLASM::Type::U32:
return fmt::format_to(ctx.out(), "{}", static_cast<s32>(value.imm_u32));
case Shader::Backend::GLASM::Type::U64:
break;
}
throw Shader::InvalidArgument("Invalid value type {}", value.type);
}
};
template <>
struct fmt::formatter<Shader::Backend::GLASM::ScalarF32> {
constexpr auto parse(format_parse_context& ctx) {
return ctx.begin();
}
template <typename FormatContext>
auto format(const Shader::Backend::GLASM::ScalarF32& value, FormatContext& ctx) {
switch (value.type) {
case Shader::Backend::GLASM::Type::Void:
break;
case Shader::Backend::GLASM::Type::Register:
return Shader::Backend::GLASM::FormatTo<true>(ctx, value.id);
case Shader::Backend::GLASM::Type::U32:
return fmt::format_to(ctx.out(), "{}", Common::BitCast<f32>(value.imm_u32));
case Shader::Backend::GLASM::Type::U64:
break;
}
throw Shader::InvalidArgument("Invalid value type {}", value.type);
}
};
template <>
struct fmt::formatter<Shader::Backend::GLASM::ScalarF64> {
constexpr auto parse(format_parse_context& ctx) {
return ctx.begin();
}
template <typename FormatContext>
auto format(const Shader::Backend::GLASM::ScalarF64& value, FormatContext& ctx) {
switch (value.type) {
case Shader::Backend::GLASM::Type::Void:
break;
case Shader::Backend::GLASM::Type::Register:
return Shader::Backend::GLASM::FormatTo<true>(ctx, value.id);
case Shader::Backend::GLASM::Type::U32:
break;
case Shader::Backend::GLASM::Type::U64:
return fmt::format_to(ctx.out(), "{}", Common::BitCast<f64>(value.imm_u64));
}
throw Shader::InvalidArgument("Invalid value type {}", value.type);
}
};

View File

@@ -0,0 +1,715 @@
// Copyright 2021 yuzu Emulator Project
// Licensed under GPLv2 or any later version
// Refer to the license.txt file included.
#include "shader_recompiler/backend/bindings.h"
#include "shader_recompiler/backend/glsl/emit_context.h"
#include "shader_recompiler/frontend/ir/program.h"
#include "shader_recompiler/profile.h"
#include "shader_recompiler/runtime_info.h"
namespace Shader::Backend::GLSL {
namespace {
u32 CbufIndex(size_t offset) {
return (offset / 4) % 4;
}
char Swizzle(size_t offset) {
return "xyzw"[CbufIndex(offset)];
}
std::string_view InterpDecorator(Interpolation interp) {
switch (interp) {
case Interpolation::Smooth:
return "";
case Interpolation::Flat:
return "flat ";
case Interpolation::NoPerspective:
return "noperspective ";
}
throw InvalidArgument("Invalid interpolation {}", interp);
}
std::string_view InputArrayDecorator(Stage stage) {
switch (stage) {
case Stage::Geometry:
case Stage::TessellationControl:
case Stage::TessellationEval:
return "[]";
default:
return "";
}
}
bool StoresPerVertexAttributes(Stage stage) {
switch (stage) {
case Stage::VertexA:
case Stage::VertexB:
case Stage::Geometry:
case Stage::TessellationEval:
return true;
default:
return false;
}
}
std::string OutputDecorator(Stage stage, u32 size) {
switch (stage) {
case Stage::TessellationControl:
return fmt::format("[{}]", size);
default:
return "";
}
}
std::string_view SamplerType(TextureType type, bool is_depth) {
if (is_depth) {
switch (type) {
case TextureType::Color1D:
return "sampler1DShadow";
case TextureType::ColorArray1D:
return "sampler1DArrayShadow";
case TextureType::Color2D:
return "sampler2DShadow";
case TextureType::ColorArray2D:
return "sampler2DArrayShadow";
case TextureType::ColorCube:
return "samplerCubeShadow";
case TextureType::ColorArrayCube:
return "samplerCubeArrayShadow";
default:
throw NotImplementedException("Texture type: {}", type);
}
}
switch (type) {
case TextureType::Color1D:
return "sampler1D";
case TextureType::ColorArray1D:
return "sampler1DArray";
case TextureType::Color2D:
return "sampler2D";
case TextureType::ColorArray2D:
return "sampler2DArray";
case TextureType::Color3D:
return "sampler3D";
case TextureType::ColorCube:
return "samplerCube";
case TextureType::ColorArrayCube:
return "samplerCubeArray";
case TextureType::Buffer:
return "samplerBuffer";
default:
throw NotImplementedException("Texture type: {}", type);
}
}
std::string_view ImageType(TextureType type) {
switch (type) {
case TextureType::Color1D:
return "uimage1D";
case TextureType::ColorArray1D:
return "uimage1DArray";
case TextureType::Color2D:
return "uimage2D";
case TextureType::ColorArray2D:
return "uimage2DArray";
case TextureType::Color3D:
return "uimage3D";
case TextureType::ColorCube:
return "uimageCube";
case TextureType::ColorArrayCube:
return "uimageCubeArray";
case TextureType::Buffer:
return "uimageBuffer";
default:
throw NotImplementedException("Image type: {}", type);
}
}
std::string_view ImageFormatString(ImageFormat format) {
switch (format) {
case ImageFormat::Typeless:
return "";
case ImageFormat::R8_UINT:
return ",r8ui";
case ImageFormat::R8_SINT:
return ",r8i";
case ImageFormat::R16_UINT:
return ",r16ui";
case ImageFormat::R16_SINT:
return ",r16i";
case ImageFormat::R32_UINT:
return ",r32ui";
case ImageFormat::R32G32_UINT:
return ",rg32ui";
case ImageFormat::R32G32B32A32_UINT:
return ",rgba32ui";
default:
throw NotImplementedException("Image format: {}", format);
}
}
std::string_view ImageAccessQualifier(bool is_written, bool is_read) {
if (is_written && !is_read) {
return "writeonly ";
}
if (is_read && !is_written) {
return "readonly ";
}
return "";
}
std::string_view GetTessMode(TessPrimitive primitive) {
switch (primitive) {
case TessPrimitive::Triangles:
return "triangles";
case TessPrimitive::Quads:
return "quads";
case TessPrimitive::Isolines:
return "isolines";
}
throw InvalidArgument("Invalid tessellation primitive {}", primitive);
}
std::string_view GetTessSpacing(TessSpacing spacing) {
switch (spacing) {
case TessSpacing::Equal:
return "equal_spacing";
case TessSpacing::FractionalOdd:
return "fractional_odd_spacing";
case TessSpacing::FractionalEven:
return "fractional_even_spacing";
}
throw InvalidArgument("Invalid tessellation spacing {}", spacing);
}
std::string_view InputPrimitive(InputTopology topology) {
switch (topology) {
case InputTopology::Points:
return "points";
case InputTopology::Lines:
return "lines";
case InputTopology::LinesAdjacency:
return "lines_adjacency";
case InputTopology::Triangles:
return "triangles";
case InputTopology::TrianglesAdjacency:
return "triangles_adjacency";
}
throw InvalidArgument("Invalid input topology {}", topology);
}
std::string_view OutputPrimitive(OutputTopology topology) {
switch (topology) {
case OutputTopology::PointList:
return "points";
case OutputTopology::LineStrip:
return "line_strip";
case OutputTopology::TriangleStrip:
return "triangle_strip";
}
throw InvalidArgument("Invalid output topology {}", topology);
}
void SetupLegacyOutPerVertex(EmitContext& ctx, std::string& header) {
if (!ctx.info.stores.Legacy()) {
return;
}
if (ctx.info.stores.FixedFunctionTexture()) {
header += "vec4 gl_TexCoord[8];";
}
if (ctx.info.stores.AnyComponent(IR::Attribute::ColorFrontDiffuseR)) {
header += "vec4 gl_FrontColor;";
}
if (ctx.info.stores.AnyComponent(IR::Attribute::ColorFrontSpecularR)) {
header += "vec4 gl_FrontSecondaryColor;";
}
if (ctx.info.stores.AnyComponent(IR::Attribute::ColorBackDiffuseR)) {
header += "vec4 gl_BackColor;";
}
if (ctx.info.stores.AnyComponent(IR::Attribute::ColorBackSpecularR)) {
header += "vec4 gl_BackSecondaryColor;";
}
}
void SetupOutPerVertex(EmitContext& ctx, std::string& header) {
if (!StoresPerVertexAttributes(ctx.stage)) {
return;
}
if (ctx.uses_geometry_passthrough) {
return;
}
header += "out gl_PerVertex{vec4 gl_Position;";
if (ctx.info.stores[IR::Attribute::PointSize]) {
header += "float gl_PointSize;";
}
if (ctx.info.stores.ClipDistances()) {
header += "float gl_ClipDistance[];";
}
if (ctx.info.stores[IR::Attribute::ViewportIndex] &&
ctx.profile.support_viewport_index_layer_non_geometry && ctx.stage != Stage::Geometry) {
header += "int gl_ViewportIndex;";
}
SetupLegacyOutPerVertex(ctx, header);
header += "};";
if (ctx.info.stores[IR::Attribute::ViewportIndex] && ctx.stage == Stage::Geometry) {
header += "out int gl_ViewportIndex;";
}
}
void SetupInPerVertex(EmitContext& ctx, std::string& header) {
// Currently only required for TessellationControl to adhere to
// ARB_separate_shader_objects requirements
if (ctx.stage != Stage::TessellationControl) {
return;
}
const bool loads_position{ctx.info.loads.AnyComponent(IR::Attribute::PositionX)};
const bool loads_point_size{ctx.info.loads[IR::Attribute::PointSize]};
const bool loads_clip_distance{ctx.info.loads.ClipDistances()};
const bool loads_per_vertex{loads_position || loads_point_size || loads_clip_distance};
if (!loads_per_vertex) {
return;
}
header += "in gl_PerVertex{";
if (loads_position) {
header += "vec4 gl_Position;";
}
if (loads_point_size) {
header += "float gl_PointSize;";
}
if (loads_clip_distance) {
header += "float gl_ClipDistance[];";
}
header += "}gl_in[gl_MaxPatchVertices];";
}
void SetupLegacyInPerFragment(EmitContext& ctx, std::string& header) {
if (!ctx.info.loads.Legacy()) {
return;
}
header += "in gl_PerFragment{";
if (ctx.info.loads.FixedFunctionTexture()) {
header += "vec4 gl_TexCoord[8];";
}
if (ctx.info.loads.AnyComponent(IR::Attribute::ColorFrontDiffuseR)) {
header += "vec4 gl_Color;";
}
header += "};";
}
} // Anonymous namespace
EmitContext::EmitContext(IR::Program& program, Bindings& bindings, const Profile& profile_,
const RuntimeInfo& runtime_info_)
: info{program.info}, profile{profile_}, runtime_info{runtime_info_}, stage{program.stage},
uses_geometry_passthrough{program.is_geometry_passthrough &&
profile.support_geometry_shader_passthrough} {
if (profile.need_fastmath_off) {
header += "#pragma optionNV(fastmath off)\n";
}
SetupExtensions();
switch (program.stage) {
case Stage::VertexA:
case Stage::VertexB:
stage_name = "vs";
break;
case Stage::TessellationControl:
stage_name = "tcs";
header += fmt::format("layout(vertices={})out;", program.invocations);
break;
case Stage::TessellationEval:
stage_name = "tes";
header += fmt::format("layout({},{},{})in;", GetTessMode(runtime_info.tess_primitive),
GetTessSpacing(runtime_info.tess_spacing),
runtime_info.tess_clockwise ? "cw" : "ccw");
break;
case Stage::Geometry:
stage_name = "gs";
header += fmt::format("layout({})in;", InputPrimitive(runtime_info.input_topology));
if (uses_geometry_passthrough) {
header += "layout(passthrough)in gl_PerVertex{vec4 gl_Position;};";
break;
} else if (program.is_geometry_passthrough &&
!profile.support_geometry_shader_passthrough) {
LOG_WARNING(Shader_GLSL, "Passthrough geometry program used but not supported");
}
header += fmt::format(
"layout({},max_vertices={})out;in gl_PerVertex{{vec4 gl_Position;}}gl_in[];",
OutputPrimitive(program.output_topology), program.output_vertices);
break;
case Stage::Fragment:
stage_name = "fs";
position_name = "gl_FragCoord";
if (runtime_info.force_early_z) {
header += "layout(early_fragment_tests)in;";
}
if (info.uses_sample_id) {
header += "in int gl_SampleID;";
}
if (info.stores_sample_mask) {
header += "out int gl_SampleMask[];";
}
break;
case Stage::Compute:
stage_name = "cs";
const u32 local_x{std::max(program.workgroup_size[0], 1u)};
const u32 local_y{std::max(program.workgroup_size[1], 1u)};
const u32 local_z{std::max(program.workgroup_size[2], 1u)};
header += fmt::format("layout(local_size_x={},local_size_y={},local_size_z={}) in;",
local_x, local_y, local_z);
break;
}
SetupOutPerVertex(*this, header);
SetupInPerVertex(*this, header);
SetupLegacyInPerFragment(*this, header);
for (size_t index = 0; index < IR::NUM_GENERICS; ++index) {
if (!info.loads.Generic(index) || !runtime_info.previous_stage_stores.Generic(index)) {
continue;
}
const auto qualifier{uses_geometry_passthrough ? "passthrough"
: fmt::format("location={}", index)};
header += fmt::format("layout({}){}in vec4 in_attr{}{};", qualifier,
InterpDecorator(info.interpolation[index]), index,
InputArrayDecorator(stage));
}
for (size_t index = 0; index < info.uses_patches.size(); ++index) {
if (!info.uses_patches[index]) {
continue;
}
const auto qualifier{stage == Stage::TessellationControl ? "out" : "in"};
header += fmt::format("layout(location={})patch {} vec4 patch{};", index, qualifier, index);
}
if (stage == Stage::Fragment) {
for (size_t index = 0; index < info.stores_frag_color.size(); ++index) {
if (!info.stores_frag_color[index] && !profile.need_declared_frag_colors) {
continue;
}
header += fmt::format("layout(location={})out vec4 frag_color{};", index, index);
}
}
for (size_t index = 0; index < IR::NUM_GENERICS; ++index) {
if (info.stores.Generic(index)) {
DefineGenericOutput(index, program.invocations);
}
}
DefineConstantBuffers(bindings);
DefineStorageBuffers(bindings);
SetupImages(bindings);
SetupTextures(bindings);
DefineHelperFunctions();
DefineConstants();
}
void EmitContext::SetupExtensions() {
header += "#extension GL_ARB_separate_shader_objects : enable\n";
if (info.uses_shadow_lod && profile.support_gl_texture_shadow_lod) {
header += "#extension GL_EXT_texture_shadow_lod : enable\n";
}
if (info.uses_int64 && profile.support_int64) {
header += "#extension GL_ARB_gpu_shader_int64 : enable\n";
}
if (info.uses_int64_bit_atomics) {
header += "#extension GL_NV_shader_atomic_int64 : enable\n";
}
if (info.uses_atomic_f32_add) {
header += "#extension GL_NV_shader_atomic_float : enable\n";
}
if (info.uses_atomic_f16x2_add || info.uses_atomic_f16x2_min || info.uses_atomic_f16x2_max) {
header += "#extension GL_NV_shader_atomic_fp16_vector : enable\n";
}
if (info.uses_fp16) {
if (profile.support_gl_nv_gpu_shader_5) {
header += "#extension GL_NV_gpu_shader5 : enable\n";
}
if (profile.support_gl_amd_gpu_shader_half_float) {
header += "#extension GL_AMD_gpu_shader_half_float : enable\n";
}
}
if (info.uses_subgroup_invocation_id || info.uses_subgroup_mask || info.uses_subgroup_vote ||
info.uses_subgroup_shuffles || info.uses_fswzadd) {
header += "#extension GL_ARB_shader_ballot : enable\n"
"#extension GL_ARB_shader_group_vote : enable\n";
if (!info.uses_int64 && profile.support_int64) {
header += "#extension GL_ARB_gpu_shader_int64 : enable\n";
}
if (profile.support_gl_warp_intrinsics) {
header += "#extension GL_NV_shader_thread_shuffle : enable\n";
}
}
if ((info.stores[IR::Attribute::ViewportIndex] || info.stores[IR::Attribute::Layer]) &&
profile.support_viewport_index_layer_non_geometry && stage != Stage::Geometry) {
header += "#extension GL_ARB_shader_viewport_layer_array : enable\n";
}
if (info.uses_sparse_residency && profile.support_gl_sparse_textures) {
header += "#extension GL_ARB_sparse_texture2 : enable\n";
}
if (info.stores[IR::Attribute::ViewportMask] && profile.support_viewport_mask) {
header += "#extension GL_NV_viewport_array2 : enable\n";
}
if (info.uses_typeless_image_reads) {
header += "#extension GL_EXT_shader_image_load_formatted : enable\n";
}
if (info.uses_derivatives && profile.support_gl_derivative_control) {
header += "#extension GL_ARB_derivative_control : enable\n";
}
if (uses_geometry_passthrough) {
header += "#extension GL_NV_geometry_shader_passthrough : enable\n";
}
}
void EmitContext::DefineConstantBuffers(Bindings& bindings) {
if (info.constant_buffer_descriptors.empty()) {
return;
}
for (const auto& desc : info.constant_buffer_descriptors) {
header += fmt::format(
"layout(std140,binding={}) uniform {}_cbuf_{}{{vec4 {}_cbuf{}[{}];}};",
bindings.uniform_buffer, stage_name, desc.index, stage_name, desc.index, 4 * 1024);
bindings.uniform_buffer += desc.count;
}
}
void EmitContext::DefineStorageBuffers(Bindings& bindings) {
if (info.storage_buffers_descriptors.empty()) {
return;
}
u32 index{};
for (const auto& desc : info.storage_buffers_descriptors) {
header += fmt::format("layout(std430,binding={}) buffer {}_ssbo_{}{{uint {}_ssbo{}[];}};",
bindings.storage_buffer, stage_name, bindings.storage_buffer,
stage_name, index);
bindings.storage_buffer += desc.count;
index += desc.count;
}
}
void EmitContext::DefineGenericOutput(size_t index, u32 invocations) {
static constexpr std::string_view swizzle{"xyzw"};
const size_t base_index{static_cast<size_t>(IR::Attribute::Generic0X) + index * 4};
u32 element{0};
while (element < 4) {
std::string definition{fmt::format("layout(location={}", index)};
const u32 remainder{4 - element};
const TransformFeedbackVarying* xfb_varying{};
if (!runtime_info.xfb_varyings.empty()) {
xfb_varying = &runtime_info.xfb_varyings[base_index + element];
xfb_varying = xfb_varying && xfb_varying->components > 0 ? xfb_varying : nullptr;
}
const u32 num_components{xfb_varying ? xfb_varying->components : remainder};
if (element > 0) {
definition += fmt::format(",component={}", element);
}
if (xfb_varying) {
definition +=
fmt::format(",xfb_buffer={},xfb_stride={},xfb_offset={}", xfb_varying->buffer,
xfb_varying->stride, xfb_varying->offset);
}
std::string name{fmt::format("out_attr{}", index)};
if (num_components < 4 || element > 0) {
name += fmt::format("_{}", swizzle.substr(element, num_components));
}
const auto type{num_components == 1 ? "float" : fmt::format("vec{}", num_components)};
definition += fmt::format(")out {} {}{};", type, name, OutputDecorator(stage, invocations));
header += definition;
const GenericElementInfo element_info{
.name = name,
.first_element = element,
.num_components = num_components,
};
std::fill_n(output_generics[index].begin() + element, num_components, element_info);
element += num_components;
}
}
void EmitContext::DefineHelperFunctions() {
header += "\n#define ftoi floatBitsToInt\n#define ftou floatBitsToUint\n"
"#define itof intBitsToFloat\n#define utof uintBitsToFloat\n";
if (info.uses_global_increment || info.uses_shared_increment) {
header += "uint CasIncrement(uint op_a,uint op_b){return op_a>=op_b?0u:(op_a+1u);}";
}
if (info.uses_global_decrement || info.uses_shared_decrement) {
header += "uint CasDecrement(uint op_a,uint op_b){"
"return op_a==0||op_a>op_b?op_b:(op_a-1u);}";
}
if (info.uses_atomic_f32_add) {
header += "uint CasFloatAdd(uint op_a,float op_b){"
"return ftou(utof(op_a)+op_b);}";
}
if (info.uses_atomic_f32x2_add) {
header += "uint CasFloatAdd32x2(uint op_a,vec2 op_b){"
"return packHalf2x16(unpackHalf2x16(op_a)+op_b);}";
}
if (info.uses_atomic_f32x2_min) {
header += "uint CasFloatMin32x2(uint op_a,vec2 op_b){return "
"packHalf2x16(min(unpackHalf2x16(op_a),op_b));}";
}
if (info.uses_atomic_f32x2_max) {
header += "uint CasFloatMax32x2(uint op_a,vec2 op_b){return "
"packHalf2x16(max(unpackHalf2x16(op_a),op_b));}";
}
if (info.uses_atomic_f16x2_add) {
header += "uint CasFloatAdd16x2(uint op_a,f16vec2 op_b){return "
"packFloat2x16(unpackFloat2x16(op_a)+op_b);}";
}
if (info.uses_atomic_f16x2_min) {
header += "uint CasFloatMin16x2(uint op_a,f16vec2 op_b){return "
"packFloat2x16(min(unpackFloat2x16(op_a),op_b));}";
}
if (info.uses_atomic_f16x2_max) {
header += "uint CasFloatMax16x2(uint op_a,f16vec2 op_b){return "
"packFloat2x16(max(unpackFloat2x16(op_a),op_b));}";
}
if (info.uses_atomic_s32_min) {
header += "uint CasMinS32(uint op_a,uint op_b){return uint(min(int(op_a),int(op_b)));}";
}
if (info.uses_atomic_s32_max) {
header += "uint CasMaxS32(uint op_a,uint op_b){return uint(max(int(op_a),int(op_b)));}";
}
if (info.uses_global_memory && profile.support_int64) {
header += DefineGlobalMemoryFunctions();
}
if (info.loads_indexed_attributes) {
const bool is_array{stage == Stage::Geometry};
const auto vertex_arg{is_array ? ",uint vertex" : ""};
std::string func{
fmt::format("float IndexedAttrLoad(int offset{}){{int base_index=offset>>2;uint "
"masked_index=uint(base_index)&3u;switch(base_index>>2){{",
vertex_arg)};
if (info.loads.AnyComponent(IR::Attribute::PositionX)) {
const auto position_idx{is_array ? "gl_in[vertex]." : ""};
func += fmt::format("case {}:return {}{}[masked_index];",
static_cast<u32>(IR::Attribute::PositionX) >> 2, position_idx,
position_name);
}
const u32 base_attribute_value = static_cast<u32>(IR::Attribute::Generic0X) >> 2;
for (u32 index = 0; index < IR::NUM_GENERICS; ++index) {
if (!info.loads.Generic(index)) {
continue;
}
const auto vertex_idx{is_array ? "[vertex]" : ""};
func += fmt::format("case {}:return in_attr{}{}[masked_index];",
base_attribute_value + index, index, vertex_idx);
}
func += "default: return 0.0;}}";
header += func;
}
if (info.stores_indexed_attributes) {
// TODO
}
}
std::string EmitContext::DefineGlobalMemoryFunctions() {
const auto define_body{[&](std::string& func, size_t index, std::string_view return_statement) {
const auto& ssbo{info.storage_buffers_descriptors[index]};
const u32 size_cbuf_offset{ssbo.cbuf_offset + 8};
const auto ssbo_addr{fmt::format("ssbo_addr{}", index)};
const auto cbuf{fmt::format("{}_cbuf{}", stage_name, ssbo.cbuf_index)};
std::array<std::string, 2> addr_xy;
std::array<std::string, 2> size_xy;
for (size_t i = 0; i < addr_xy.size(); ++i) {
const auto addr_loc{ssbo.cbuf_offset + 4 * i};
const auto size_loc{size_cbuf_offset + 4 * i};
addr_xy[i] = fmt::format("ftou({}[{}].{})", cbuf, addr_loc / 16, Swizzle(addr_loc));
size_xy[i] = fmt::format("ftou({}[{}].{})", cbuf, size_loc / 16, Swizzle(size_loc));
}
const auto addr_pack{fmt::format("packUint2x32(uvec2({},{}))", addr_xy[0], addr_xy[1])};
const auto addr_statment{fmt::format("uint64_t {}={};", ssbo_addr, addr_pack)};
func += addr_statment;
const auto size_vec{fmt::format("uvec2({},{})", size_xy[0], size_xy[1])};
const auto comp_lhs{fmt::format("(addr>={})", ssbo_addr)};
const auto comp_rhs{fmt::format("(addr<({}+uint64_t({})))", ssbo_addr, size_vec)};
const auto comparison{fmt::format("if({}&&{}){{", comp_lhs, comp_rhs)};
func += comparison;
const auto ssbo_name{fmt::format("{}_ssbo{}", stage_name, index)};
func += fmt::format(fmt::runtime(return_statement), ssbo_name, ssbo_addr);
}};
std::string write_func{"void WriteGlobal32(uint64_t addr,uint data){"};
std::string write_func_64{"void WriteGlobal64(uint64_t addr,uvec2 data){"};
std::string write_func_128{"void WriteGlobal128(uint64_t addr,uvec4 data){"};
std::string load_func{"uint LoadGlobal32(uint64_t addr){"};
std::string load_func_64{"uvec2 LoadGlobal64(uint64_t addr){"};
std::string load_func_128{"uvec4 LoadGlobal128(uint64_t addr){"};
const size_t num_buffers{info.storage_buffers_descriptors.size()};
for (size_t index = 0; index < num_buffers; ++index) {
if (!info.nvn_buffer_used[index]) {
continue;
}
define_body(write_func, index, "{0}[uint(addr-{1})>>2]=data;return;}}");
define_body(write_func_64, index,
"{0}[uint(addr-{1})>>2]=data.x;{0}[uint(addr-{1}+4)>>2]=data.y;return;}}");
define_body(write_func_128, index,
"{0}[uint(addr-{1})>>2]=data.x;{0}[uint(addr-{1}+4)>>2]=data.y;{0}[uint("
"addr-{1}+8)>>2]=data.z;{0}[uint(addr-{1}+12)>>2]=data.w;return;}}");
define_body(load_func, index, "return {0}[uint(addr-{1})>>2];}}");
define_body(load_func_64, index,
"return uvec2({0}[uint(addr-{1})>>2],{0}[uint(addr-{1}+4)>>2]);}}");
define_body(load_func_128, index,
"return uvec4({0}[uint(addr-{1})>>2],{0}[uint(addr-{1}+4)>>2],{0}["
"uint(addr-{1}+8)>>2],{0}[uint(addr-{1}+12)>>2]);}}");
}
write_func += '}';
write_func_64 += '}';
write_func_128 += '}';
load_func += "return 0u;}";
load_func_64 += "return uvec2(0);}";
load_func_128 += "return uvec4(0);}";
return write_func + write_func_64 + write_func_128 + load_func + load_func_64 + load_func_128;
}
void EmitContext::SetupImages(Bindings& bindings) {
image_buffers.reserve(info.image_buffer_descriptors.size());
for (const auto& desc : info.image_buffer_descriptors) {
image_buffers.push_back({bindings.image, desc.count});
const auto format{ImageFormatString(desc.format)};
const auto qualifier{ImageAccessQualifier(desc.is_written, desc.is_read)};
const auto array_decorator{desc.count > 1 ? fmt::format("[{}]", desc.count) : ""};
header += fmt::format("layout(binding={}{}) uniform {}uimageBuffer img{}{};",
bindings.image, format, qualifier, bindings.image, array_decorator);
bindings.image += desc.count;
}
images.reserve(info.image_descriptors.size());
for (const auto& desc : info.image_descriptors) {
images.push_back({bindings.image, desc.count});
const auto format{ImageFormatString(desc.format)};
const auto image_type{ImageType(desc.type)};
const auto qualifier{ImageAccessQualifier(desc.is_written, desc.is_read)};
const auto array_decorator{desc.count > 1 ? fmt::format("[{}]", desc.count) : ""};
header += fmt::format("layout(binding={}{})uniform {}{} img{}{};", bindings.image, format,
qualifier, image_type, bindings.image, array_decorator);
bindings.image += desc.count;
}
}
void EmitContext::SetupTextures(Bindings& bindings) {
texture_buffers.reserve(info.texture_buffer_descriptors.size());
for (const auto& desc : info.texture_buffer_descriptors) {
texture_buffers.push_back({bindings.texture, desc.count});
const auto sampler_type{SamplerType(TextureType::Buffer, false)};
const auto array_decorator{desc.count > 1 ? fmt::format("[{}]", desc.count) : ""};
header += fmt::format("layout(binding={}) uniform {} tex{}{};", bindings.texture,
sampler_type, bindings.texture, array_decorator);
bindings.texture += desc.count;
}
textures.reserve(info.texture_descriptors.size());
for (const auto& desc : info.texture_descriptors) {
textures.push_back({bindings.texture, desc.count});
const auto sampler_type{SamplerType(desc.type, desc.is_depth)};
const auto array_decorator{desc.count > 1 ? fmt::format("[{}]", desc.count) : ""};
header += fmt::format("layout(binding={}) uniform {} tex{}{};", bindings.texture,
sampler_type, bindings.texture, array_decorator);
bindings.texture += desc.count;
}
}
void EmitContext::DefineConstants() {
if (info.uses_fswzadd) {
header += "const float FSWZ_A[]=float[4](-1.f,1.f,-1.f,0.f);"
"const float FSWZ_B[]=float[4](-1.f,-1.f,1.f,-1.f);";
}
}
} // namespace Shader::Backend::GLSL

View File

@@ -0,0 +1,174 @@
// Copyright 2021 yuzu Emulator Project
// Licensed under GPLv2 or any later version
// Refer to the license.txt file included.
#pragma once
#include <string>
#include <utility>
#include <vector>
#include <fmt/format.h>
#include "shader_recompiler/backend/glsl/var_alloc.h"
#include "shader_recompiler/stage.h"
namespace Shader {
struct Info;
struct Profile;
struct RuntimeInfo;
} // namespace Shader
namespace Shader::Backend {
struct Bindings;
}
namespace Shader::IR {
class Inst;
struct Program;
} // namespace Shader::IR
namespace Shader::Backend::GLSL {
struct GenericElementInfo {
std::string name;
u32 first_element{};
u32 num_components{};
};
struct TextureImageDefinition {
u32 binding;
u32 count;
};
class EmitContext {
public:
explicit EmitContext(IR::Program& program, Bindings& bindings, const Profile& profile_,
const RuntimeInfo& runtime_info_);
template <GlslVarType type, typename... Args>
void Add(const char* format_str, IR::Inst& inst, Args&&... args) {
const auto var_def{var_alloc.AddDefine(inst, type)};
if (var_def.empty()) {
// skip assigment.
code += fmt::format(fmt::runtime(format_str + 3), std::forward<Args>(args)...);
} else {
code += fmt::format(fmt::runtime(format_str), var_def, std::forward<Args>(args)...);
}
// TODO: Remove this
code += '\n';
}
template <typename... Args>
void AddU1(const char* format_str, IR::Inst& inst, Args&&... args) {
Add<GlslVarType::U1>(format_str, inst, args...);
}
template <typename... Args>
void AddF16x2(const char* format_str, IR::Inst& inst, Args&&... args) {
Add<GlslVarType::F16x2>(format_str, inst, args...);
}
template <typename... Args>
void AddU32(const char* format_str, IR::Inst& inst, Args&&... args) {
Add<GlslVarType::U32>(format_str, inst, args...);
}
template <typename... Args>
void AddF32(const char* format_str, IR::Inst& inst, Args&&... args) {
Add<GlslVarType::F32>(format_str, inst, args...);
}
template <typename... Args>
void AddU64(const char* format_str, IR::Inst& inst, Args&&... args) {
Add<GlslVarType::U64>(format_str, inst, args...);
}
template <typename... Args>
void AddF64(const char* format_str, IR::Inst& inst, Args&&... args) {
Add<GlslVarType::F64>(format_str, inst, args...);
}
template <typename... Args>
void AddU32x2(const char* format_str, IR::Inst& inst, Args&&... args) {
Add<GlslVarType::U32x2>(format_str, inst, args...);
}
template <typename... Args>
void AddF32x2(const char* format_str, IR::Inst& inst, Args&&... args) {
Add<GlslVarType::F32x2>(format_str, inst, args...);
}
template <typename... Args>
void AddU32x3(const char* format_str, IR::Inst& inst, Args&&... args) {
Add<GlslVarType::U32x3>(format_str, inst, args...);
}
template <typename... Args>
void AddF32x3(const char* format_str, IR::Inst& inst, Args&&... args) {
Add<GlslVarType::F32x3>(format_str, inst, args...);
}
template <typename... Args>
void AddU32x4(const char* format_str, IR::Inst& inst, Args&&... args) {
Add<GlslVarType::U32x4>(format_str, inst, args...);
}
template <typename... Args>
void AddF32x4(const char* format_str, IR::Inst& inst, Args&&... args) {
Add<GlslVarType::F32x4>(format_str, inst, args...);
}
template <typename... Args>
void AddPrecF32(const char* format_str, IR::Inst& inst, Args&&... args) {
Add<GlslVarType::PrecF32>(format_str, inst, args...);
}
template <typename... Args>
void AddPrecF64(const char* format_str, IR::Inst& inst, Args&&... args) {
Add<GlslVarType::PrecF64>(format_str, inst, args...);
}
template <typename... Args>
void Add(const char* format_str, Args&&... args) {
code += fmt::format(fmt::runtime(format_str), std::forward<Args>(args)...);
// TODO: Remove this
code += '\n';
}
std::string header;
std::string code;
VarAlloc var_alloc;
const Info& info;
const Profile& profile;
const RuntimeInfo& runtime_info;
Stage stage{};
std::string_view stage_name = "invalid";
std::string_view position_name = "gl_Position";
std::vector<TextureImageDefinition> texture_buffers;
std::vector<TextureImageDefinition> image_buffers;
std::vector<TextureImageDefinition> textures;
std::vector<TextureImageDefinition> images;
std::array<std::array<GenericElementInfo, 4>, 32> output_generics{};
u32 num_safety_loop_vars{};
bool uses_y_direction{};
bool uses_cc_carry{};
bool uses_geometry_passthrough{};
private:
void SetupExtensions();
void DefineConstantBuffers(Bindings& bindings);
void DefineStorageBuffers(Bindings& bindings);
void DefineGenericOutput(size_t index, u32 invocations);
void DefineHelperFunctions();
void DefineConstants();
std::string DefineGlobalMemoryFunctions();
void SetupImages(Bindings& bindings);
void SetupTextures(Bindings& bindings);
};
} // namespace Shader::Backend::GLSL

View File

@@ -0,0 +1,252 @@
// Copyright 2021 yuzu Emulator Project
// Licensed under GPLv2 or any later version
// Refer to the license.txt file included.
#include <algorithm>
#include <string>
#include <tuple>
#include <type_traits>
#include "common/div_ceil.h"
#include "common/settings.h"
#include "shader_recompiler/backend/glsl/emit_context.h"
#include "shader_recompiler/backend/glsl/emit_glsl.h"
#include "shader_recompiler/backend/glsl/emit_glsl_instructions.h"
#include "shader_recompiler/frontend/ir/ir_emitter.h"
namespace Shader::Backend::GLSL {
namespace {
template <class Func>
struct FuncTraits {};
template <class ReturnType_, class... Args>
struct FuncTraits<ReturnType_ (*)(Args...)> {
using ReturnType = ReturnType_;
static constexpr size_t NUM_ARGS = sizeof...(Args);
template <size_t I>
using ArgType = std::tuple_element_t<I, std::tuple<Args...>>;
};
template <auto func, typename... Args>
void SetDefinition(EmitContext& ctx, IR::Inst* inst, Args... args) {
inst->SetDefinition<Id>(func(ctx, std::forward<Args>(args)...));
}
template <typename ArgType>
auto Arg(EmitContext& ctx, const IR::Value& arg) {
if constexpr (std::is_same_v<ArgType, std::string_view>) {
return ctx.var_alloc.Consume(arg);
} else if constexpr (std::is_same_v<ArgType, const IR::Value&>) {
return arg;
} else if constexpr (std::is_same_v<ArgType, u32>) {
return arg.U32();
} else if constexpr (std::is_same_v<ArgType, IR::Attribute>) {
return arg.Attribute();
} else if constexpr (std::is_same_v<ArgType, IR::Patch>) {
return arg.Patch();
} else if constexpr (std::is_same_v<ArgType, IR::Reg>) {
return arg.Reg();
}
}
template <auto func, bool is_first_arg_inst, size_t... I>
void Invoke(EmitContext& ctx, IR::Inst* inst, std::index_sequence<I...>) {
using Traits = FuncTraits<decltype(func)>;
if constexpr (std::is_same_v<typename Traits::ReturnType, Id>) {
if constexpr (is_first_arg_inst) {
SetDefinition<func>(
ctx, inst, *inst,
Arg<typename Traits::template ArgType<I + 2>>(ctx, inst->Arg(I))...);
} else {
SetDefinition<func>(
ctx, inst, Arg<typename Traits::template ArgType<I + 1>>(ctx, inst->Arg(I))...);
}
} else {
if constexpr (is_first_arg_inst) {
func(ctx, *inst, Arg<typename Traits::template ArgType<I + 2>>(ctx, inst->Arg(I))...);
} else {
func(ctx, Arg<typename Traits::template ArgType<I + 1>>(ctx, inst->Arg(I))...);
}
}
}
template <auto func>
void Invoke(EmitContext& ctx, IR::Inst* inst) {
using Traits = FuncTraits<decltype(func)>;
static_assert(Traits::NUM_ARGS >= 1, "Insufficient arguments");
if constexpr (Traits::NUM_ARGS == 1) {
Invoke<func, false>(ctx, inst, std::make_index_sequence<0>{});
} else {
using FirstArgType = typename Traits::template ArgType<1>;
static constexpr bool is_first_arg_inst = std::is_same_v<FirstArgType, IR::Inst&>;
using Indices = std::make_index_sequence<Traits::NUM_ARGS - (is_first_arg_inst ? 2 : 1)>;
Invoke<func, is_first_arg_inst>(ctx, inst, Indices{});
}
}
void EmitInst(EmitContext& ctx, IR::Inst* inst) {
switch (inst->GetOpcode()) {
#define OPCODE(name, result_type, ...) \
case IR::Opcode::name: \
return Invoke<&Emit##name>(ctx, inst);
#include "shader_recompiler/frontend/ir/opcodes.inc"
#undef OPCODE
}
throw LogicError("Invalid opcode {}", inst->GetOpcode());
}
bool IsReference(IR::Inst& inst) {
return inst.GetOpcode() == IR::Opcode::Reference;
}
void PrecolorInst(IR::Inst& phi) {
// Insert phi moves before references to avoid overwritting other phis
const size_t num_args{phi.NumArgs()};
for (size_t i = 0; i < num_args; ++i) {
IR::Block& phi_block{*phi.PhiBlock(i)};
auto it{std::find_if_not(phi_block.rbegin(), phi_block.rend(), IsReference).base()};
IR::IREmitter ir{phi_block, it};
const IR::Value arg{phi.Arg(i)};
if (arg.IsImmediate()) {
ir.PhiMove(phi, arg);
} else {
ir.PhiMove(phi, IR::Value{arg.InstRecursive()});
}
}
for (size_t i = 0; i < num_args; ++i) {
IR::IREmitter{*phi.PhiBlock(i)}.Reference(IR::Value{&phi});
}
}
void Precolor(const IR::Program& program) {
for (IR::Block* const block : program.blocks) {
for (IR::Inst& phi : block->Instructions()) {
if (!IR::IsPhi(phi)) {
break;
}
PrecolorInst(phi);
}
}
}
void EmitCode(EmitContext& ctx, const IR::Program& program) {
for (const IR::AbstractSyntaxNode& node : program.syntax_list) {
switch (node.type) {
case IR::AbstractSyntaxNode::Type::Block:
for (IR::Inst& inst : node.data.block->Instructions()) {
EmitInst(ctx, &inst);
}
break;
case IR::AbstractSyntaxNode::Type::If:
ctx.Add("if({}){{", ctx.var_alloc.Consume(node.data.if_node.cond));
break;
case IR::AbstractSyntaxNode::Type::EndIf:
ctx.Add("}}");
break;
case IR::AbstractSyntaxNode::Type::Break:
if (node.data.break_node.cond.IsImmediate()) {
if (node.data.break_node.cond.U1()) {
ctx.Add("break;");
}
} else {
ctx.Add("if({}){{break;}}", ctx.var_alloc.Consume(node.data.break_node.cond));
}
break;
case IR::AbstractSyntaxNode::Type::Return:
case IR::AbstractSyntaxNode::Type::Unreachable:
ctx.Add("return;");
break;
case IR::AbstractSyntaxNode::Type::Loop:
ctx.Add("for(;;){{");
break;
case IR::AbstractSyntaxNode::Type::Repeat:
if (Settings::values.disable_shader_loop_safety_checks) {
ctx.Add("if(!{}){{break;}}}}", ctx.var_alloc.Consume(node.data.repeat.cond));
} else {
ctx.Add("if(--loop{}<0 || !{}){{break;}}}}", ctx.num_safety_loop_vars++,
ctx.var_alloc.Consume(node.data.repeat.cond));
}
break;
default:
throw NotImplementedException("AbstractSyntaxNode Type {}", node.type);
}
}
}
std::string GlslVersionSpecifier(const EmitContext& ctx) {
if (ctx.uses_y_direction || ctx.info.stores.Legacy() || ctx.info.loads.Legacy()) {
return " compatibility";
}
return "";
}
bool IsPreciseType(GlslVarType type) {
switch (type) {
case GlslVarType::PrecF32:
case GlslVarType::PrecF64:
return true;
default:
return false;
}
}
void DefineVariables(const EmitContext& ctx, std::string& header) {
for (u32 i = 0; i < static_cast<u32>(GlslVarType::Void); ++i) {
const auto type{static_cast<GlslVarType>(i)};
const auto& tracker{ctx.var_alloc.GetUseTracker(type)};
const auto type_name{ctx.var_alloc.GetGlslType(type)};
const bool has_precise_bug{ctx.stage == Stage::Fragment && ctx.profile.has_gl_precise_bug};
const auto precise{!has_precise_bug && IsPreciseType(type) ? "precise " : ""};
// Temps/return types that are never used are stored at index 0
if (tracker.uses_temp) {
header += fmt::format("{}{} t{}={}(0);", precise, type_name,
ctx.var_alloc.Representation(0, type), type_name);
}
for (u32 index = 0; index < tracker.num_used; ++index) {
header += fmt::format("{}{} {}={}(0);", precise, type_name,
ctx.var_alloc.Representation(index, type), type_name);
}
}
for (u32 i = 0; i < ctx.num_safety_loop_vars; ++i) {
header += fmt::format("int loop{}=0x2000;", i);
}
}
} // Anonymous namespace
std::string EmitGLSL(const Profile& profile, const RuntimeInfo& runtime_info, IR::Program& program,
Bindings& bindings) {
EmitContext ctx{program, bindings, profile, runtime_info};
Precolor(program);
EmitCode(ctx, program);
const std::string version{fmt::format("#version 450{}\n", GlslVersionSpecifier(ctx))};
ctx.header.insert(0, version);
if (program.shared_memory_size > 0) {
const auto requested_size{program.shared_memory_size};
const auto max_size{profile.gl_max_compute_smem_size};
const bool needs_clamp{requested_size > max_size};
if (needs_clamp) {
LOG_WARNING(Shader_GLSL, "Requested shared memory size ({}) exceeds device limit ({})",
requested_size, max_size);
}
const auto smem_size{needs_clamp ? max_size : requested_size};
ctx.header += fmt::format("shared uint smem[{}];", Common::DivCeil(smem_size, 4U));
}
ctx.header += "void main(){\n";
if (program.local_memory_size > 0) {
ctx.header += fmt::format("uint lmem[{}];", Common::DivCeil(program.local_memory_size, 4U));
}
DefineVariables(ctx, ctx.header);
if (ctx.uses_cc_carry) {
ctx.header += "uint carry;";
}
if (program.info.uses_subgroup_shuffles) {
ctx.header += "bool shfl_in_bounds;";
}
ctx.code.insert(0, ctx.header);
ctx.code += '}';
return ctx.code;
}
} // namespace Shader::Backend::GLSL

View File

@@ -0,0 +1,24 @@
// Copyright 2021 yuzu Emulator Project
// Licensed under GPLv2 or any later version
// Refer to the license.txt file included.
#pragma once
#include <string>
#include "shader_recompiler/backend/bindings.h"
#include "shader_recompiler/frontend/ir/program.h"
#include "shader_recompiler/profile.h"
#include "shader_recompiler/runtime_info.h"
namespace Shader::Backend::GLSL {
[[nodiscard]] std::string EmitGLSL(const Profile& profile, const RuntimeInfo& runtime_info,
IR::Program& program, Bindings& bindings);
[[nodiscard]] inline std::string EmitGLSL(const Profile& profile, IR::Program& program) {
Bindings binding;
return EmitGLSL(profile, {}, program, binding);
}
} // namespace Shader::Backend::GLSL

View File

@@ -0,0 +1,418 @@
// Copyright 2021 yuzu Emulator Project
// Licensed under GPLv2 or any later version
// Refer to the license.txt file included.
#include <string_view>
#include "shader_recompiler/backend/glsl/emit_context.h"
#include "shader_recompiler/backend/glsl/emit_glsl_instructions.h"
#include "shader_recompiler/frontend/ir/value.h"
namespace Shader::Backend::GLSL {
namespace {
constexpr char cas_loop[]{
"for (;;){{uint old={};{}=atomicCompSwap({},old,{}({},{}));if({}==old){{break;}}}}"};
void SharedCasFunction(EmitContext& ctx, IR::Inst& inst, std::string_view offset,
std::string_view value, std::string_view function) {
const auto ret{ctx.var_alloc.Define(inst, GlslVarType::U32)};
const std::string smem{fmt::format("smem[{}>>2]", offset)};
ctx.Add(cas_loop, smem, ret, smem, function, smem, value, ret);
}
void SsboCasFunction(EmitContext& ctx, IR::Inst& inst, const IR::Value& binding,
const IR::Value& offset, std::string_view value, std::string_view function) {
const auto ret{ctx.var_alloc.Define(inst, GlslVarType::U32)};
const std::string ssbo{fmt::format("{}_ssbo{}[{}>>2]", ctx.stage_name, binding.U32(),
ctx.var_alloc.Consume(offset))};
ctx.Add(cas_loop, ssbo, ret, ssbo, function, ssbo, value, ret);
}
void SsboCasFunctionF32(EmitContext& ctx, IR::Inst& inst, const IR::Value& binding,
const IR::Value& offset, std::string_view value,
std::string_view function) {
const std::string ssbo{fmt::format("{}_ssbo{}[{}>>2]", ctx.stage_name, binding.U32(),
ctx.var_alloc.Consume(offset))};
const auto ret{ctx.var_alloc.Define(inst, GlslVarType::U32)};
ctx.Add(cas_loop, ssbo, ret, ssbo, function, ssbo, value, ret);
ctx.AddF32("{}=utof({});", inst, ret);
}
} // Anonymous namespace
void EmitSharedAtomicIAdd32(EmitContext& ctx, IR::Inst& inst, std::string_view pointer_offset,
std::string_view value) {
ctx.AddU32("{}=atomicAdd(smem[{}>>2],{});", inst, pointer_offset, value);
}
void EmitSharedAtomicSMin32(EmitContext& ctx, IR::Inst& inst, std::string_view pointer_offset,
std::string_view value) {
const std::string u32_value{fmt::format("uint({})", value)};
SharedCasFunction(ctx, inst, pointer_offset, u32_value, "CasMinS32");
}
void EmitSharedAtomicUMin32(EmitContext& ctx, IR::Inst& inst, std::string_view pointer_offset,
std::string_view value) {
ctx.AddU32("{}=atomicMin(smem[{}>>2],{});", inst, pointer_offset, value);
}
void EmitSharedAtomicSMax32(EmitContext& ctx, IR::Inst& inst, std::string_view pointer_offset,
std::string_view value) {
const std::string u32_value{fmt::format("uint({})", value)};
SharedCasFunction(ctx, inst, pointer_offset, u32_value, "CasMaxS32");
}
void EmitSharedAtomicUMax32(EmitContext& ctx, IR::Inst& inst, std::string_view pointer_offset,
std::string_view value) {
ctx.AddU32("{}=atomicMax(smem[{}>>2],{});", inst, pointer_offset, value);
}
void EmitSharedAtomicInc32(EmitContext& ctx, IR::Inst& inst, std::string_view pointer_offset,
std::string_view value) {
SharedCasFunction(ctx, inst, pointer_offset, value, "CasIncrement");
}
void EmitSharedAtomicDec32(EmitContext& ctx, IR::Inst& inst, std::string_view pointer_offset,
std::string_view value) {
SharedCasFunction(ctx, inst, pointer_offset, value, "CasDecrement");
}
void EmitSharedAtomicAnd32(EmitContext& ctx, IR::Inst& inst, std::string_view pointer_offset,
std::string_view value) {
ctx.AddU32("{}=atomicAnd(smem[{}>>2],{});", inst, pointer_offset, value);
}
void EmitSharedAtomicOr32(EmitContext& ctx, IR::Inst& inst, std::string_view pointer_offset,
std::string_view value) {
ctx.AddU32("{}=atomicOr(smem[{}>>2],{});", inst, pointer_offset, value);
}
void EmitSharedAtomicXor32(EmitContext& ctx, IR::Inst& inst, std::string_view pointer_offset,
std::string_view value) {
ctx.AddU32("{}=atomicXor(smem[{}>>2],{});", inst, pointer_offset, value);
}
void EmitSharedAtomicExchange32(EmitContext& ctx, IR::Inst& inst, std::string_view pointer_offset,
std::string_view value) {
ctx.AddU32("{}=atomicExchange(smem[{}>>2],{});", inst, pointer_offset, value);
}
void EmitSharedAtomicExchange64(EmitContext& ctx, IR::Inst& inst, std::string_view pointer_offset,
std::string_view value) {
LOG_WARNING(Shader_GLSL, "Int64 atomics not supported, fallback to non-atomic");
ctx.AddU64("{}=packUint2x32(uvec2(smem[{}>>2],smem[({}+4)>>2]));", inst, pointer_offset,
pointer_offset);
ctx.Add("smem[{}>>2]=unpackUint2x32({}).x;smem[({}+4)>>2]=unpackUint2x32({}).y;",
pointer_offset, value, pointer_offset, value);
}
void EmitStorageAtomicIAdd32(EmitContext& ctx, IR::Inst& inst, const IR::Value& binding,
const IR::Value& offset, std::string_view value) {
ctx.AddU32("{}=atomicAdd({}_ssbo{}[{}>>2],{});", inst, ctx.stage_name, binding.U32(),
ctx.var_alloc.Consume(offset), value);
}
void EmitStorageAtomicSMin32(EmitContext& ctx, IR::Inst& inst, const IR::Value& binding,
const IR::Value& offset, std::string_view value) {
const std::string u32_value{fmt::format("uint({})", value)};
SsboCasFunction(ctx, inst, binding, offset, u32_value, "CasMinS32");
}
void EmitStorageAtomicUMin32(EmitContext& ctx, IR::Inst& inst, const IR::Value& binding,
const IR::Value& offset, std::string_view value) {
ctx.AddU32("{}=atomicMin({}_ssbo{}[{}>>2],{});", inst, ctx.stage_name, binding.U32(),
ctx.var_alloc.Consume(offset), value);
}
void EmitStorageAtomicSMax32(EmitContext& ctx, IR::Inst& inst, const IR::Value& binding,
const IR::Value& offset, std::string_view value) {
const std::string u32_value{fmt::format("uint({})", value)};
SsboCasFunction(ctx, inst, binding, offset, u32_value, "CasMaxS32");
}
void EmitStorageAtomicUMax32(EmitContext& ctx, IR::Inst& inst, const IR::Value& binding,
const IR::Value& offset, std::string_view value) {
ctx.AddU32("{}=atomicMax({}_ssbo{}[{}>>2],{});", inst, ctx.stage_name, binding.U32(),
ctx.var_alloc.Consume(offset), value);
}
void EmitStorageAtomicInc32(EmitContext& ctx, IR::Inst& inst, const IR::Value& binding,
const IR::Value& offset, std::string_view value) {
SsboCasFunction(ctx, inst, binding, offset, value, "CasIncrement");
}
void EmitStorageAtomicDec32(EmitContext& ctx, IR::Inst& inst, const IR::Value& binding,
const IR::Value& offset, std::string_view value) {
SsboCasFunction(ctx, inst, binding, offset, value, "CasDecrement");
}
void EmitStorageAtomicAnd32(EmitContext& ctx, IR::Inst& inst, const IR::Value& binding,
const IR::Value& offset, std::string_view value) {
ctx.AddU32("{}=atomicAnd({}_ssbo{}[{}>>2],{});", inst, ctx.stage_name, binding.U32(),
ctx.var_alloc.Consume(offset), value);
}
void EmitStorageAtomicOr32(EmitContext& ctx, IR::Inst& inst, const IR::Value& binding,
const IR::Value& offset, std::string_view value) {
ctx.AddU32("{}=atomicOr({}_ssbo{}[{}>>2],{});", inst, ctx.stage_name, binding.U32(),
ctx.var_alloc.Consume(offset), value);
}
void EmitStorageAtomicXor32(EmitContext& ctx, IR::Inst& inst, const IR::Value& binding,
const IR::Value& offset, std::string_view value) {
ctx.AddU32("{}=atomicXor({}_ssbo{}[{}>>2],{});", inst, ctx.stage_name, binding.U32(),
ctx.var_alloc.Consume(offset), value);
}
void EmitStorageAtomicExchange32(EmitContext& ctx, IR::Inst& inst, const IR::Value& binding,
const IR::Value& offset, std::string_view value) {
ctx.AddU32("{}=atomicExchange({}_ssbo{}[{}>>2],{});", inst, ctx.stage_name, binding.U32(),
ctx.var_alloc.Consume(offset), value);
}
void EmitStorageAtomicIAdd64(EmitContext& ctx, IR::Inst& inst, const IR::Value& binding,
const IR::Value& offset, std::string_view value) {
LOG_WARNING(Shader_GLSL, "Int64 atomics not supported, fallback to non-atomic");
ctx.AddU64("{}=packUint2x32(uvec2({}_ssbo{}[{}>>2],{}_ssbo{}[({}>>2)+1]));", inst,
ctx.stage_name, binding.U32(), ctx.var_alloc.Consume(offset), ctx.stage_name,
binding.U32(), ctx.var_alloc.Consume(offset));
ctx.Add("{}_ssbo{}[{}>>2]+=unpackUint2x32({}).x;{}_ssbo{}[({}>>2)+1]+=unpackUint2x32({}).y;",
ctx.stage_name, binding.U32(), ctx.var_alloc.Consume(offset), value, ctx.stage_name,
binding.U32(), ctx.var_alloc.Consume(offset), value);
}
void EmitStorageAtomicSMin64(EmitContext& ctx, IR::Inst& inst, const IR::Value& binding,
const IR::Value& offset, std::string_view value) {
LOG_WARNING(Shader_GLSL, "Int64 atomics not supported, fallback to non-atomic");
ctx.AddU64("{}=packInt2x32(ivec2({}_ssbo{}[{}>>2],{}_ssbo{}[({}>>2)+1]));", inst,
ctx.stage_name, binding.U32(), ctx.var_alloc.Consume(offset), ctx.stage_name,
binding.U32(), ctx.var_alloc.Consume(offset));
ctx.Add("for(int i=0;i<2;++i){{ "
"{}_ssbo{}[({}>>2)+i]=uint(min(int({}_ssbo{}[({}>>2)+i]),unpackInt2x32(int64_t({}))[i])"
");}}",
ctx.stage_name, binding.U32(), ctx.var_alloc.Consume(offset), ctx.stage_name,
binding.U32(), ctx.var_alloc.Consume(offset), value);
}
void EmitStorageAtomicUMin64(EmitContext& ctx, IR::Inst& inst, const IR::Value& binding,
const IR::Value& offset, std::string_view value) {
LOG_WARNING(Shader_GLSL, "Int64 atomics not supported, fallback to non-atomic");
ctx.AddU64("{}=packUint2x32(uvec2({}_ssbo{}[{}>>2],{}_ssbo{}[({}>>2)+1]));", inst,
ctx.stage_name, binding.U32(), ctx.var_alloc.Consume(offset), ctx.stage_name,
binding.U32(), ctx.var_alloc.Consume(offset));
ctx.Add("for(int i=0;i<2;++i){{ "
"{}_ssbo{}[({}>>2)+i]=min({}_ssbo{}[({}>>2)+i],unpackUint2x32(uint64_t({}))[i]);}}",
ctx.stage_name, binding.U32(), ctx.var_alloc.Consume(offset), ctx.stage_name,
binding.U32(), ctx.var_alloc.Consume(offset), value);
}
void EmitStorageAtomicSMax64(EmitContext& ctx, IR::Inst& inst, const IR::Value& binding,
const IR::Value& offset, std::string_view value) {
LOG_WARNING(Shader_GLSL, "Int64 atomics not supported, fallback to non-atomic");
ctx.AddU64("{}=packInt2x32(ivec2({}_ssbo{}[{}>>2],{}_ssbo{}[({}>>2)+1]));", inst,
ctx.stage_name, binding.U32(), ctx.var_alloc.Consume(offset), ctx.stage_name,
binding.U32(), ctx.var_alloc.Consume(offset));
ctx.Add("for(int i=0;i<2;++i){{ "
"{}_ssbo{}[({}>>2)+i]=uint(max(int({}_ssbo{}[({}>>2)+i]),unpackInt2x32(int64_t({}))[i])"
");}}",
ctx.stage_name, binding.U32(), ctx.var_alloc.Consume(offset), ctx.stage_name,
binding.U32(), ctx.var_alloc.Consume(offset), value);
}
void EmitStorageAtomicUMax64(EmitContext& ctx, IR::Inst& inst, const IR::Value& binding,
const IR::Value& offset, std::string_view value) {
LOG_WARNING(Shader_GLSL, "Int64 atomics not supported, fallback to non-atomic");
ctx.AddU64("{}=packUint2x32(uvec2({}_ssbo{}[{}>>2],{}_ssbo{}[({}>>2)+1]));", inst,
ctx.stage_name, binding.U32(), ctx.var_alloc.Consume(offset), ctx.stage_name,
binding.U32(), ctx.var_alloc.Consume(offset));
ctx.Add("for(int "
"i=0;i<2;++i){{{}_ssbo{}[({}>>2)+i]=max({}_ssbo{}[({}>>2)+i],unpackUint2x32(uint64_t({}"
"))[i]);}}",
ctx.stage_name, binding.U32(), ctx.var_alloc.Consume(offset), ctx.stage_name,
binding.U32(), ctx.var_alloc.Consume(offset), value);
}
void EmitStorageAtomicAnd64(EmitContext& ctx, IR::Inst& inst, const IR::Value& binding,
const IR::Value& offset, std::string_view value) {
ctx.AddU64(
"{}=packUint2x32(uvec2(atomicAnd({}_ssbo{}[{}>>2],unpackUint2x32({}).x),atomicAnd({}_"
"ssbo{}[({}>>2)+1],unpackUint2x32({}).y)));",
inst, ctx.stage_name, binding.U32(), ctx.var_alloc.Consume(offset), value, ctx.stage_name,
binding.U32(), ctx.var_alloc.Consume(offset), value);
}
void EmitStorageAtomicOr64(EmitContext& ctx, IR::Inst& inst, const IR::Value& binding,
const IR::Value& offset, std::string_view value) {
ctx.AddU64("{}=packUint2x32(uvec2(atomicOr({}_ssbo{}[{}>>2],unpackUint2x32({}).x),atomicOr({}_"
"ssbo{}[({}>>2)+1],unpackUint2x32({}).y)));",
inst, ctx.stage_name, binding.U32(), ctx.var_alloc.Consume(offset), value,
ctx.stage_name, binding.U32(), ctx.var_alloc.Consume(offset), value);
}
void EmitStorageAtomicXor64(EmitContext& ctx, IR::Inst& inst, const IR::Value& binding,
const IR::Value& offset, std::string_view value) {
ctx.AddU64(
"{}=packUint2x32(uvec2(atomicXor({}_ssbo{}[{}>>2],unpackUint2x32({}).x),atomicXor({}_"
"ssbo{}[({}>>2)+1],unpackUint2x32({}).y)));",
inst, ctx.stage_name, binding.U32(), ctx.var_alloc.Consume(offset), value, ctx.stage_name,
binding.U32(), ctx.var_alloc.Consume(offset), value);
}
void EmitStorageAtomicExchange64(EmitContext& ctx, IR::Inst& inst, const IR::Value& binding,
const IR::Value& offset, std::string_view value) {
ctx.AddU64("{}=packUint2x32(uvec2(atomicExchange({}_ssbo{}[{}>>2],unpackUint2x32({}).x),"
"atomicExchange({}_ssbo{}[({}>>2)+1],unpackUint2x32({}).y)));",
inst, ctx.stage_name, binding.U32(), ctx.var_alloc.Consume(offset), value,
ctx.stage_name, binding.U32(), ctx.var_alloc.Consume(offset), value);
}
void EmitStorageAtomicAddF32(EmitContext& ctx, IR::Inst& inst, const IR::Value& binding,
const IR::Value& offset, std::string_view value) {
SsboCasFunctionF32(ctx, inst, binding, offset, value, "CasFloatAdd");
}
void EmitStorageAtomicAddF16x2(EmitContext& ctx, IR::Inst& inst, const IR::Value& binding,
const IR::Value& offset, std::string_view value) {
SsboCasFunction(ctx, inst, binding, offset, value, "CasFloatAdd16x2");
}
void EmitStorageAtomicAddF32x2(EmitContext& ctx, IR::Inst& inst, const IR::Value& binding,
const IR::Value& offset, std::string_view value) {
SsboCasFunction(ctx, inst, binding, offset, value, "CasFloatAdd32x2");
}
void EmitStorageAtomicMinF16x2(EmitContext& ctx, IR::Inst& inst, const IR::Value& binding,
const IR::Value& offset, std::string_view value) {
SsboCasFunction(ctx, inst, binding, offset, value, "CasFloatMin16x2");
}
void EmitStorageAtomicMinF32x2(EmitContext& ctx, IR::Inst& inst, const IR::Value& binding,
const IR::Value& offset, std::string_view value) {
SsboCasFunction(ctx, inst, binding, offset, value, "CasFloatMin32x2");
}
void EmitStorageAtomicMaxF16x2(EmitContext& ctx, IR::Inst& inst, const IR::Value& binding,
const IR::Value& offset, std::string_view value) {
SsboCasFunction(ctx, inst, binding, offset, value, "CasFloatMax16x2");
}
void EmitStorageAtomicMaxF32x2(EmitContext& ctx, IR::Inst& inst, const IR::Value& binding,
const IR::Value& offset, std::string_view value) {
SsboCasFunction(ctx, inst, binding, offset, value, "CasFloatMax32x2");
}
void EmitGlobalAtomicIAdd32(EmitContext&) {
throw NotImplementedException("GLSL Instrucion");
}
void EmitGlobalAtomicSMin32(EmitContext&) {
throw NotImplementedException("GLSL Instrucion");
}
void EmitGlobalAtomicUMin32(EmitContext&) {
throw NotImplementedException("GLSL Instrucion");
}
void EmitGlobalAtomicSMax32(EmitContext&) {
throw NotImplementedException("GLSL Instrucion");
}
void EmitGlobalAtomicUMax32(EmitContext&) {
throw NotImplementedException("GLSL Instrucion");
}
void EmitGlobalAtomicInc32(EmitContext&) {
throw NotImplementedException("GLSL Instrucion");
}
void EmitGlobalAtomicDec32(EmitContext&) {
throw NotImplementedException("GLSL Instrucion");
}
void EmitGlobalAtomicAnd32(EmitContext&) {
throw NotImplementedException("GLSL Instrucion");
}
void EmitGlobalAtomicOr32(EmitContext&) {
throw NotImplementedException("GLSL Instrucion");
}
void EmitGlobalAtomicXor32(EmitContext&) {
throw NotImplementedException("GLSL Instrucion");
}
void EmitGlobalAtomicExchange32(EmitContext&) {
throw NotImplementedException("GLSL Instrucion");
}
void EmitGlobalAtomicIAdd64(EmitContext&) {
throw NotImplementedException("GLSL Instrucion");
}
void EmitGlobalAtomicSMin64(EmitContext&) {
throw NotImplementedException("GLSL Instrucion");
}
void EmitGlobalAtomicUMin64(EmitContext&) {
throw NotImplementedException("GLSL Instrucion");
}
void EmitGlobalAtomicSMax64(EmitContext&) {
throw NotImplementedException("GLSL Instrucion");
}
void EmitGlobalAtomicUMax64(EmitContext&) {
throw NotImplementedException("GLSL Instrucion");
}
void EmitGlobalAtomicInc64(EmitContext&) {
throw NotImplementedException("GLSL Instrucion");
}
void EmitGlobalAtomicDec64(EmitContext&) {
throw NotImplementedException("GLSL Instrucion");
}
void EmitGlobalAtomicAnd64(EmitContext&) {
throw NotImplementedException("GLSL Instrucion");
}
void EmitGlobalAtomicOr64(EmitContext&) {
throw NotImplementedException("GLSL Instrucion");
}
void EmitGlobalAtomicXor64(EmitContext&) {
throw NotImplementedException("GLSL Instrucion");
}
void EmitGlobalAtomicExchange64(EmitContext&) {
throw NotImplementedException("GLSL Instrucion");
}
void EmitGlobalAtomicAddF32(EmitContext&) {
throw NotImplementedException("GLSL Instrucion");
}
void EmitGlobalAtomicAddF16x2(EmitContext&) {
throw NotImplementedException("GLSL Instrucion");
}
void EmitGlobalAtomicAddF32x2(EmitContext&) {
throw NotImplementedException("GLSL Instrucion");
}
void EmitGlobalAtomicMinF16x2(EmitContext&) {
throw NotImplementedException("GLSL Instrucion");
}
void EmitGlobalAtomicMinF32x2(EmitContext&) {
throw NotImplementedException("GLSL Instrucion");
}
void EmitGlobalAtomicMaxF16x2(EmitContext&) {
throw NotImplementedException("GLSL Instrucion");
}
void EmitGlobalAtomicMaxF32x2(EmitContext&) {
throw NotImplementedException("GLSL Instrucion");
}
} // namespace Shader::Backend::GLSL

View File

@@ -0,0 +1,21 @@
// Copyright 2021 yuzu Emulator Project
// Licensed under GPLv2 or any later version
// Refer to the license.txt file included.
#include "shader_recompiler/backend/glsl/emit_context.h"
#include "shader_recompiler/backend/glsl/emit_glsl_instructions.h"
#include "shader_recompiler/frontend/ir/value.h"
namespace Shader::Backend::GLSL {
void EmitBarrier(EmitContext& ctx) {
ctx.Add("barrier();");
}
void EmitWorkgroupMemoryBarrier(EmitContext& ctx) {
ctx.Add("groupMemoryBarrier();");
}
void EmitDeviceMemoryBarrier(EmitContext& ctx) {
ctx.Add("memoryBarrier();");
}
} // namespace Shader::Backend::GLSL

Some files were not shown because too many files have changed in this diff Show More