Compare commits

...

89 Commits

Author SHA1 Message Date
Subv
4c59105adf GPU: Implement offsetted rendering when using non-indexed drawing. 2018-07-02 11:23:36 -05:00
Subv
fca3d1cc65 GPU: Fixed the index offset rendering, and implemented the base vertex functionality.
This fixes Stardew Valley.
2018-07-02 11:22:17 -05:00
Subv
cc73bad293 GPU: Added register definitions for the vertex buffer base element. 2018-07-02 11:21:23 -05:00
bunnei
066d6184d4 Merge pull request #602 from Subv/mufu_subop
GPU: Corrected the size of the MUFU subop field, and removed incorrect "min" operation.
2018-07-01 11:06:04 -04:00
bunnei
b611d852db Merge pull request #601 from Subv/rgba32_ui
GPU: Implement the RGBA32_UINT rendertarget format.
2018-07-01 03:22:38 -04:00
bunnei
85a60e2044 Merge pull request #600 from bunnei/pred-not-eq-nan
gl_shader_decompiler: Implement predicate NotEqualWithNan.
2018-07-01 03:22:11 -04:00
Subv
f33e406ff2 GPU: Corrected the size of the MUFU subop field, and removed incorrect "min" operation. 2018-06-30 14:48:25 -05:00
Subv
c0e2d52758 GPU: Implemented the RGBA32_UINT rendertarget format. 2018-06-30 14:23:13 -05:00
Subv
b11072d54a GLCache: Specify the component type along the texture type in the format tuple. 2018-06-30 14:08:51 -05:00
bunnei
c96da97630 gl_shader_decompiler: Implement predicate NotEqualWithNan. 2018-06-30 03:01:25 -04:00
bunnei
50ef2beb58 Merge pull request #595 from bunnei/raster-cache
Rewrite the OpenGL rasterizer cache
2018-06-29 14:07:28 -04:00
bunnei
c18425ef98 gl_rasterizer_cache: Only dereference color_surface/depth_surface if valid. 2018-06-29 13:08:08 -04:00
bunnei
da2bdbc0d7 Merge pull request #588 from mailwl/hwopus
Service/Audio: add hwopus service, stub GetWorkBufferSize function
2018-06-27 21:57:21 -04:00
bunnei
7fa9177830 gl_shader_decompiler: Add a return path for unknown instructions. 2018-06-27 01:14:34 -04:00
bunnei
1dd754590f gl_rasterizer_cache: Implement caching for texture and framebuffer surfaces.
gl_rasterizer_cache: Improved cache management based on Citra's implementation.

gl_surface_cache: Add some docstrings.
2018-06-27 00:15:44 -04:00
bunnei
8af1ae46aa gl_rasterizer_cache: Various fixes for ASTC handling. 2018-06-27 00:08:04 -04:00
bunnei
c7c379bd19 gl_rasterizer_cache: Use SurfaceParams as a key for surface caching. 2018-06-27 00:08:04 -04:00
bunnei
6a28a66832 maxwell_3d: Add a struct for RenderTargetConfig. 2018-06-27 00:08:04 -04:00
bunnei
1bbbd26563 settings: Add a configuration for use_accurate_framebuffers. 2018-06-27 00:08:04 -04:00
bunnei
3f9f047375 gl_rasterizer: Implement AccelerateDisplay to forward textures to framebuffers. 2018-06-27 00:08:03 -04:00
bunnei
ff6785f3e8 gl_rasterizer_cache: Cache size_in_bytes as a const per surface. 2018-06-27 00:08:03 -04:00
bunnei
9f2f819bb6 gl_rasterizer_cache: Refactor to make SurfaceParams members const. 2018-06-27 00:08:03 -04:00
bunnei
5f57ab1b2a gl_rasterizer_cache: Remove Citra's rasterizer cache, always load/flush surfaces. 2018-06-27 00:08:03 -04:00
bunnei
84cadf9918 Merge pull request #594 from bunnei/max-constbuff
gl_rasterizer: Workaround for when exceeding max UBO size.
2018-06-27 00:06:23 -04:00
bunnei
10422f3c18 gl_rasterizer: Workaround for when exceeding max UBO size. 2018-06-26 23:07:34 -04:00
bunnei
dfac394e60 Merge pull request #593 from bunnei/fix-swizzle
gl_state: Fix state management for texture swizzle.
2018-06-26 22:05:49 -04:00
bunnei
73de9bab1a Merge pull request #592 from bunnei/cleanup-gl-state
gl_state: Remove unused state management from 3DS.
2018-06-26 22:05:03 -04:00
bunnei
0399d98cd9 Merge pull request #591 from bunnei/fix-rgb565
gl_rasterizer_cache: Fix inverted B5G6R5 format.
2018-06-26 22:04:42 -04:00
bunnei
8447d20a11 gl_state: Fix state management for texture swizzle. 2018-06-26 17:15:58 -04:00
bunnei
20b58bab9c gl_state: Remove unused state management from 3DS. 2018-06-26 17:09:25 -04:00
bunnei
41b3725d28 gl_rasterizer_cache: Fix inverted B5G6R5 format. 2018-06-26 17:07:36 -04:00
bunnei
2981408722 Merge pull request #590 from bunnei/rm-ssbo-check
yuzu: Remove SSBOs check from Qt frontend.
2018-06-26 14:28:56 -04:00
bunnei
1669911b1d yuzu: Remove SSBOs check from Qt frontend. 2018-06-26 11:28:56 -04:00
bunnei
36dedae842 Merge pull request #554 from Subv/constbuffer_ubo
Rasterizer: Use UBOs instead of SSBOs for uploading const buffers.
2018-06-26 10:25:56 -04:00
bunnei
1da0ee57fd Merge pull request #589 from mailwl/fix-crash
Fix crash at exit
2018-06-26 01:01:10 -04:00
mailwl
ad39bab271 Fix crash at exit 2018-06-25 18:01:08 +03:00
David
c9e821e93e Send the correct RequestUpdateAudioRenderer revision in the output header (#587)
* We should be returning our revision instead of what is requested.

Hardware test on a 5.1.0 console

* Added sysversion comment
2018-06-25 10:34:41 -04:00
mailwl
11fb17054e Service/Audio: add hwopus service, stub GetWorkBufferSize function 2018-06-25 16:44:17 +03:00
David
838724c588 Removed duplicate structs, changed AudioRendererResponse -> UpdateDataHeader (#583)
* Removed duplicate structs, changed AudioRendererResponse -> UpdateDataHeader

According to game symbols (SMO), there are references to UpdateDataHeader, which seems to be what AudioRendererResponse actually is

* oops

* AudioRendererParameters should be AudioRendererParameter according to SMO
2018-06-23 20:46:29 -04:00
bunnei
0b831dd2ba Revert "Use Ninja for MSVC AppVeyor builds" (#584) 2018-06-23 03:17:32 -04:00
David
81f24f5685 Fixed RequestUpdateAudioRenderer deadlocks and calculated section sizes properly (#580)
* Fixed RequestUpdateAudioRenderer deadlocks and calculated section sizes properly

This fixes RequestUpdateAudioRenderer deadlocks in games like Puyo Puyo Tetris, and fixes games that require a proper section size, such as Retro City Rampage. These fixes cause various games to start rendering or trying to render
2018-06-22 22:22:33 -04:00
bunnei
ea1880f47c Merge pull request #526 from janisozaur/appveyor-ninja
Use Ninja for MSVC AppVeyor builds
2018-06-22 14:28:26 -04:00
bunnei
6d7941042b Merge pull request #579 from SciresM/master
svc: Fully implement svcSignalToAddress and svcWaitForAddress
2018-06-22 12:08:39 -04:00
bunnei
52a78228dd Merge pull request #581 from mailwl/empty-buf-skip
IPC: skip empty buffer write
2018-06-22 10:26:09 -04:00
mailwl
a27befe456 IPC: skip empty buffer write
Prevents a yuzu crash when games, like Axiom Verge, try to read 0 bytes from a file
2018-06-22 11:28:10 +03:00
Michael Scire
067ac434ba Kernel/Arbiters: Fix casts, cleanup comments/magic numbers 2018-06-22 00:47:59 -06:00
Michael Scire
5f8aa02584 Add additional missing format. 2018-06-21 21:09:51 -06:00
Michael Scire
08d454e30d Run clang-format on PR. 2018-06-21 21:05:34 -06:00
bunnei
b7162c32a4 Merge pull request #577 from mailwl/audren-update
Service/Audio: update audren:u service
2018-06-21 22:40:37 -04:00
Michael Scire
dc70a87af1 Kernel/Arbiters: HLE is atomic, adjust code to reflect that. 2018-06-21 20:25:57 -06:00
Zach Hilman
63f26d5c40 Add support for decrypted NCA files (#567)
* Start to add NCA support in loader

* More nca stuff

* More changes to nca.cpp

* Now identifies decrypted NCA cont.

* Game list fixes and more structs and stuff

* More updates to Nca class

* Now reads ExeFs (i think)

* ACTUALLY LOADS EXEFS!

* RomFS loads and games execute

* Cleanup and Finalize

* plumbing, cleanup and testing

* fix some things that i didnt think of before

* Preliminary Review Changes

* Review changes for bunnei and subv
2018-06-21 11:16:23 -04:00
Michael Scire
8f8fe62a19 Kernel/Arbiters: Initialize arb_wait_address in thread struct. 2018-06-21 05:13:06 -06:00
Michael Scire
62bd1299ea Kernel/Arbiters: Clear WaitAddress in SignalToAddress 2018-06-21 04:20:39 -06:00
Michael Scire
4f81bc4e1b Kernel/Arbiters: Mostly implement SignalToAddress 2018-06-21 04:10:11 -06:00
Michael Scire
9d71ce88ce Kernel/Arbiters: Implement WaitForAddress 2018-06-21 01:40:29 -06:00
mailwl
c06d6b27f3 Service/Audio: update audren:u service 2018-06-21 10:26:24 +03:00
Michael Scire
7e191dccc1 Kernel/Arbiters: Add stubs for 4.x SignalToAddress/WaitForAddress SVCs. 2018-06-21 00:49:43 -06:00
bunnei
c3e95086b6 Merge pull request #576 from Subv/warnings1
Build: Fixed some MSVC warnings in various parts of the code.
2018-06-20 16:46:14 -04:00
Subv
a3d82ef5d9 Build: Fixed some MSVC warnings in various parts of the code. 2018-06-20 11:39:10 -05:00
greggameplayer
be1f5dedfb Implement GetAvailableLanguageCodes2 (#575)
* Implement GetAvailableLanguageCodes2

* Revert "Implement GetAvailableLanguageCodes2"

This reverts commit caadd9eea3.

* Implement GetAvailableLanguageCodes2

* Implement GetAvailableLanguageCodes2
2018-06-19 11:29:04 -04:00
bunnei
7a0bb406d5 Merge pull request #574 from Subv/shader_abs_neg
GPU: Perform negation after absolute value in the float shader instructions.
2018-06-18 22:24:57 -04:00
bunnei
0d8ae773f1 Merge pull request #561 from DarkLordZach/fix-odyssey-input-crash
Avoid initializing single-joycon layouts with handheld controller
2018-06-18 22:06:11 -04:00
bunnei
1ab133d7fa Merge pull request #573 from Subv/shader_imm
GPU: Don't mark uniform buffers and registers as used for instructions which don't have them.
2018-06-18 21:52:56 -04:00
Subv
38989bef43 GPU: Perform negation after absolute value in the float shader instructions. 2018-06-18 19:56:29 -05:00
Subv
eab7457c00 GPU: Don't mark uniform buffers and registers as used for instructions which don't have them.
Like the MOV32I and FMUL32I instructions.
This fixes a potential crash when using these instructions.
2018-06-18 19:50:35 -05:00
bunnei
0e13d9cb7b Merge pull request #570 from bunnei/astc
gl_rasterizer: Implement texture format ASTC_2D_4X4.
2018-06-18 19:08:49 -04:00
bunnei
c11cfaa705 Merge pull request #562 from DarkLordZach/extracted-ncas-ui
Add UI support for extracted NCA folders
2018-06-18 16:09:46 -04:00
bunnei
4ac4b308e4 Merge pull request #572 from Armada651/user-except-stub
svc: Add a stub for UserExceptionContextAddr.
2018-06-18 11:37:13 -04:00
bunnei
ea080501fb Merge pull request #571 from Armada651/loose-blend
gl_rasterizer: Get loose on independent blending.
2018-06-18 11:36:50 -04:00
Jules Blok
bf4e2b2f0b svc: Add a stub for UserExceptionContextAddr. 2018-06-18 09:29:11 +02:00
Jules Blok
7c7f4a9be2 gl_rasterizer: Get loose on independent blending. 2018-06-18 09:27:06 +02:00
bunnei
61779fa072 gl_rasterizer: Implement texture format ASTC_2D_4X4. 2018-06-18 01:56:59 -04:00
bunnei
d2277b825e Merge pull request #569 from bunnei/fix-cache
gl_rasterizer_cache: Loosen things up a bit.
2018-06-18 01:32:12 -04:00
bunnei
fe906fff36 gl_rasterizer_cache: Loosen things up a bit. 2018-06-18 00:55:59 -04:00
bunnei
f9af74201c Merge pull request #568 from bunnei/lop
gl_shader_decompiler: Implement LOP instructions.
2018-06-17 17:44:38 -04:00
bunnei
afdd657d30 gl_shader_decompiler: Implement LOP instructions. 2018-06-17 15:27:48 -04:00
bunnei
5673ce39c7 gl_shader_decompiler: Refactor LOP32I instruction a bit in support of LOP. 2018-06-17 13:31:39 -04:00
bunnei
3c43ea5c68 Merge pull request #565 from bunnei/shader_conversions
gl_shader_decompiler: Implement register size conversions for I2I and I2F.
2018-06-16 08:50:29 -04:00
bunnei
d383043e07 gl_shader_decompiler: Implement integer size conversions for I2I/I2F/F2I. 2018-06-15 22:42:02 -04:00
bunnei
fb5bd0920d Merge pull request #564 from bunnei/lop32i_passb
gl_shader_decompiler: Implement LOP32I LogicOperation PassB.
2018-06-15 22:04:03 -04:00
bunnei
46cbb6b090 Merge pull request #566 from bunnei/set_pos_w
gl_shader_gen: Set position.w to 1.
2018-06-15 22:03:48 -04:00
bunnei
55c49d5bf4 gl_shader_gen: Set position.w to 1. 2018-06-15 20:47:04 -04:00
bunnei
61f9d9c4ab gl_shader_decompiler: Implement LOP32I LogicOperation PassB. 2018-06-15 20:43:33 -04:00
Zach Hilman
ac88d3e89f Narrow down filter of layout configs 2018-06-13 20:03:12 -04:00
Zach Hilman
a353322b58 Move loop condition to free function 2018-06-13 13:44:46 -04:00
Zach Hilman
50153a1cb2 Avoid initializing single-joycon layouts with handheld controller 2018-06-13 13:01:05 -04:00
Subv
2a7653142d Rasterizer: Use UBOs instead of SSBOs for uploading const buffers.
This should help a bit with GPU performance once we're GPU-bound.
2018-06-09 18:02:05 -05:00
Michał Janiszewski
5c3d5d0849 Use Ninja for MSVC AppVeyor builds 2018-06-05 22:46:54 +02:00
Michał Janiszewski
79de0f8fe8 Drop /std:c++latest from MSVC command line
CMake already sets it to version 17 in all cases
2018-06-05 22:41:28 +02:00
60 changed files with 3410 additions and 1868 deletions

View File

@@ -40,6 +40,8 @@ add_library(core STATIC
hle/config_mem.h
hle/ipc.h
hle/ipc_helpers.h
hle/kernel/address_arbiter.cpp
hle/kernel/address_arbiter.h
hle/kernel/client_port.cpp
hle/kernel/client_port.h
hle/kernel/client_session.cpp
@@ -124,6 +126,8 @@ add_library(core STATIC
hle/service/audio/audren_u.h
hle/service/audio/codecctl.cpp
hle/service/audio/codecctl.h
hle/service/audio/hwopus.cpp
hle/service/audio/hwopus.h
hle/service/bcat/module.cpp
hle/service/bcat/module.h
hle/service/bcat/bcat.cpp
@@ -257,6 +261,8 @@ add_library(core STATIC
loader/linker.h
loader/loader.cpp
loader/loader.h
loader/nca.cpp
loader/nca.h
loader/nro.cpp
loader/nro.h
loader/nso.cpp

View File

@@ -19,13 +19,20 @@ Loader::ResultStatus PartitionFilesystem::Load(const std::string& file_path, siz
if (file.GetSize() < sizeof(Header))
return Loader::ResultStatus::Error;
file.Seek(offset, SEEK_SET);
// For cartridges, HFSs can get very large, so we need to calculate the size up to
// the actual content itself instead of just blindly reading in the entire file.
Header pfs_header;
if (!file.ReadBytes(&pfs_header, sizeof(Header)))
return Loader::ResultStatus::Error;
bool is_hfs = (memcmp(pfs_header.magic.data(), "HFS", 3) == 0);
if (pfs_header.magic != Common::MakeMagic('H', 'F', 'S', '0') &&
pfs_header.magic != Common::MakeMagic('P', 'F', 'S', '0')) {
return Loader::ResultStatus::ErrorInvalidFormat;
}
bool is_hfs = pfs_header.magic == Common::MakeMagic('H', 'F', 'S', '0');
size_t entry_size = is_hfs ? sizeof(HFSEntry) : sizeof(PFSEntry);
size_t metadata_size =
sizeof(Header) + (pfs_header.num_entries * entry_size) + pfs_header.strtab_size;
@@ -50,7 +57,12 @@ Loader::ResultStatus PartitionFilesystem::Load(const std::vector<u8>& file_data,
return Loader::ResultStatus::Error;
memcpy(&pfs_header, &file_data[offset], sizeof(Header));
is_hfs = (memcmp(pfs_header.magic.data(), "HFS", 3) == 0);
if (pfs_header.magic != Common::MakeMagic('H', 'F', 'S', '0') &&
pfs_header.magic != Common::MakeMagic('P', 'F', 'S', '0')) {
return Loader::ResultStatus::ErrorInvalidFormat;
}
is_hfs = pfs_header.magic == Common::MakeMagic('H', 'F', 'S', '0');
size_t entries_offset = offset + sizeof(Header);
size_t entry_size = is_hfs ? sizeof(HFSEntry) : sizeof(PFSEntry);
@@ -73,21 +85,21 @@ u32 PartitionFilesystem::GetNumEntries() const {
return pfs_header.num_entries;
}
u64 PartitionFilesystem::GetEntryOffset(int index) const {
u64 PartitionFilesystem::GetEntryOffset(u32 index) const {
if (index >= GetNumEntries())
return 0;
return content_offset + pfs_entries[index].fs_entry.offset;
}
u64 PartitionFilesystem::GetEntrySize(int index) const {
u64 PartitionFilesystem::GetEntrySize(u32 index) const {
if (index >= GetNumEntries())
return 0;
return pfs_entries[index].fs_entry.size;
}
std::string PartitionFilesystem::GetEntryName(int index) const {
std::string PartitionFilesystem::GetEntryName(u32 index) const {
if (index >= GetNumEntries())
return "";
@@ -113,7 +125,7 @@ u64 PartitionFilesystem::GetFileSize(const std::string& name) const {
}
void PartitionFilesystem::Print() const {
NGLOG_DEBUG(Service_FS, "Magic: {:.4}", pfs_header.magic.data());
NGLOG_DEBUG(Service_FS, "Magic: {}", pfs_header.magic);
NGLOG_DEBUG(Service_FS, "Files: {}", pfs_header.num_entries);
for (u32 i = 0; i < pfs_header.num_entries; i++) {
NGLOG_DEBUG(Service_FS, " > File {}: {} (0x{:X} bytes, at 0x{:X})", i,
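
The hunks above replace a three-byte `memcmp` prefix check with an exact comparison of the whole 32-bit magic. A minimal sketch of that idea follows. `MakeMagicSketch`, `MagicKind` and `ClassifyMagic` are hypothetical names introduced here for illustration, and the byte order (first character in the least significant byte, matching a little-endian read) is an assumption about how `Common::MakeMagic` behaves, not something this diff shows.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for Common::MakeMagic (assumption: the first character
// lands in the least significant byte, so the value equals a little-endian
// read of the four magic bytes).
constexpr std::uint32_t MakeMagicSketch(char a, char b, char c, char d) {
    return static_cast<std::uint32_t>(static_cast<unsigned char>(a)) |
           static_cast<std::uint32_t>(static_cast<unsigned char>(b)) << 8 |
           static_cast<std::uint32_t>(static_cast<unsigned char>(c)) << 16 |
           static_cast<std::uint32_t>(static_cast<unsigned char>(d)) << 24;
}

// Hypothetical classification of a header magic.
enum class MagicKind { Invalid, PFS, HFS };

// Compares all four bytes at once, so "HFS1" or "PFSx" is rejected instead of
// being accepted on a three-byte prefix match.
constexpr MagicKind ClassifyMagic(std::uint32_t magic) {
    if (magic == MakeMagicSketch('H', 'F', 'S', '0'))
        return MagicKind::HFS;
    if (magic == MakeMagicSketch('P', 'F', 'S', '0'))
        return MagicKind::PFS;
    return MagicKind::Invalid;
}
```

Under these assumptions, `assert(ClassifyMagic(MakeMagicSketch('H', 'F', 'S', '1')) == MagicKind::Invalid);` holds, whereas a three-byte prefix comparison would have treated that header as HFS.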

View File

@@ -27,9 +27,9 @@ public:
Loader::ResultStatus Load(const std::vector<u8>& file_data, size_t offset = 0);
u32 GetNumEntries() const;
u64 GetEntryOffset(int index) const;
u64 GetEntrySize(int index) const;
std::string GetEntryName(int index) const;
u64 GetEntryOffset(u32 index) const;
u64 GetEntrySize(u32 index) const;
std::string GetEntryName(u32 index) const;
u64 GetFileOffset(const std::string& name) const;
u64 GetFileSize(const std::string& name) const;
@@ -37,7 +37,7 @@ public:
private:
struct Header {
std::array<char, 4> magic;
u32_le magic;
u32_le num_entries;
u32_le strtab_size;
INSERT_PADDING_BYTES(0x4);

View File

@@ -0,0 +1,173 @@
// Copyright 2018 yuzu emulator team
// Licensed under GPLv2 or any later version
// Refer to the license.txt file included.
#include "common/assert.h"
#include "common/common_funcs.h"
#include "common/common_types.h"
#include "core/core.h"
#include "core/hle/kernel/errors.h"
#include "core/hle/kernel/kernel.h"
#include "core/hle/kernel/process.h"
#include "core/hle/kernel/thread.h"
#include "core/hle/lock.h"
#include "core/memory.h"
namespace Kernel {
namespace AddressArbiter {
// Performs actual address waiting logic.
static ResultCode WaitForAddress(VAddr address, s64 timeout) {
SharedPtr<Thread> current_thread = GetCurrentThread();
current_thread->arb_wait_address = address;
current_thread->status = THREADSTATUS_WAIT_ARB;
current_thread->wakeup_callback = nullptr;
current_thread->WakeAfterDelay(timeout);
Core::System::GetInstance().CpuCore(current_thread->processor_id).PrepareReschedule();
return RESULT_TIMEOUT;
}
// Gets the threads waiting on an address.
static void GetThreadsWaitingOnAddress(std::vector<SharedPtr<Thread>>& waiting_threads,
VAddr address) {
auto RetrieveWaitingThreads =
[](size_t core_index, std::vector<SharedPtr<Thread>>& waiting_threads, VAddr arb_addr) {
const auto& scheduler = Core::System::GetInstance().Scheduler(core_index);
auto& thread_list = scheduler->GetThreadList();
for (auto& thread : thread_list) {
if (thread->arb_wait_address == arb_addr)
waiting_threads.push_back(thread);
}
};
// Retrieve a list of all threads that are waiting for this address.
RetrieveWaitingThreads(0, waiting_threads, address);
RetrieveWaitingThreads(1, waiting_threads, address);
RetrieveWaitingThreads(2, waiting_threads, address);
RetrieveWaitingThreads(3, waiting_threads, address);
// Sort them by priority, such that the highest priority ones come first.
std::sort(waiting_threads.begin(), waiting_threads.end(),
[](const SharedPtr<Thread>& lhs, const SharedPtr<Thread>& rhs) {
return lhs->current_priority < rhs->current_priority;
});
}
// Wake up num_to_wake (or all) threads in a vector.
static void WakeThreads(std::vector<SharedPtr<Thread>>& waiting_threads, s32 num_to_wake) {
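// Only process up to 'num_to_wake' threads, unless 'num_to_wake' is <= 0, in which case
// process them all. Never process more threads than are actually waiting.
size_t last = waiting_threads.size();
if (num_to_wake > 0 && static_cast<size_t>(num_to_wake) < last)
last = static_cast<size_t>(num_to_wake);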
// Signal the waiting threads.
for (size_t i = 0; i < last; i++) {
ASSERT(waiting_threads[i]->status == THREADSTATUS_WAIT_ARB);
waiting_threads[i]->SetWaitSynchronizationResult(RESULT_SUCCESS);
waiting_threads[i]->arb_wait_address = 0;
waiting_threads[i]->ResumeFromWait();
}
}
// Signals an address being waited on.
ResultCode SignalToAddress(VAddr address, s32 num_to_wake) {
// Get threads waiting on the address.
std::vector<SharedPtr<Thread>> waiting_threads;
GetThreadsWaitingOnAddress(waiting_threads, address);
WakeThreads(waiting_threads, num_to_wake);
return RESULT_SUCCESS;
}
// Signals an address being waited on and increments its value if equal to the value argument.
ResultCode IncrementAndSignalToAddressIfEqual(VAddr address, s32 value, s32 num_to_wake) {
// Ensure that we can write to the address.
if (!Memory::IsValidVirtualAddress(address)) {
return ERR_INVALID_ADDRESS_STATE;
}
if (static_cast<s32>(Memory::Read32(address)) == value) {
Memory::Write32(address, static_cast<u32>(value + 1));
} else {
return ERR_INVALID_STATE;
}
return SignalToAddress(address, num_to_wake);
}
// Signals an address being waited on and modifies its value based on waiting thread count if equal
// to the value argument.
ResultCode ModifyByWaitingCountAndSignalToAddressIfEqual(VAddr address, s32 value,
s32 num_to_wake) {
// Ensure that we can write to the address.
if (!Memory::IsValidVirtualAddress(address)) {
return ERR_INVALID_ADDRESS_STATE;
}
// Get threads waiting on the address.
std::vector<SharedPtr<Thread>> waiting_threads;
GetThreadsWaitingOnAddress(waiting_threads, address);
// Determine the modified value depending on the waiting count.
s32 updated_value;
if (waiting_threads.size() == 0) {
updated_value = value - 1;
} else if (num_to_wake <= 0 || waiting_threads.size() <= num_to_wake) {
updated_value = value + 1;
} else {
updated_value = value;
}
if (static_cast<s32>(Memory::Read32(address)) == value) {
Memory::Write32(address, static_cast<u32>(updated_value));
} else {
return ERR_INVALID_STATE;
}
WakeThreads(waiting_threads, num_to_wake);
return RESULT_SUCCESS;
}
// Waits on an address if the value at that address is less than the argument value, optionally
// decrementing it.
ResultCode WaitForAddressIfLessThan(VAddr address, s32 value, s64 timeout, bool should_decrement) {
// Ensure that we can read the address.
if (!Memory::IsValidVirtualAddress(address)) {
return ERR_INVALID_ADDRESS_STATE;
}
s32 cur_value = static_cast<s32>(Memory::Read32(address));
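if (cur_value < value) {
// Only decrement when the caller asked for it (DecrementAndWaitIfLessThan).
if (should_decrement) {
Memory::Write32(address, static_cast<u32>(cur_value - 1));
}
} else {
return ERR_INVALID_STATE;
}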
// Short-circuit without rescheduling, if timeout is zero.
if (timeout == 0) {
return RESULT_TIMEOUT;
}
return WaitForAddress(address, timeout);
}
// Waits on an address if the value at that address is equal to the argument value.
ResultCode WaitForAddressIfEqual(VAddr address, s32 value, s64 timeout) {
// Ensure that we can read the address.
if (!Memory::IsValidVirtualAddress(address)) {
return ERR_INVALID_ADDRESS_STATE;
}
// Only wait for the address if equal.
if (static_cast<s32>(Memory::Read32(address)) != value) {
return ERR_INVALID_STATE;
}
// Short-circuit without rescheduling, if timeout is zero.
if (timeout == 0) {
return RESULT_TIMEOUT;
}
return WaitForAddress(address, timeout);
}
} // namespace AddressArbiter
} // namespace Kernel

View File

@@ -0,0 +1,32 @@
// Copyright 2018 yuzu emulator team
// Licensed under GPLv2 or any later version
// Refer to the license.txt file included.
#pragma once
#include "core/hle/result.h"
namespace Kernel {
namespace AddressArbiter {
enum class ArbitrationType {
WaitIfLessThan = 0,
DecrementAndWaitIfLessThan = 1,
WaitIfEqual = 2,
};
enum class SignalType {
Signal = 0,
IncrementAndSignalIfEqual = 1,
ModifyByWaitingCountAndSignalIfEqual = 2,
};
ResultCode SignalToAddress(VAddr address, s32 num_to_wake);
ResultCode IncrementAndSignalToAddressIfEqual(VAddr address, s32 value, s32 num_to_wake);
ResultCode ModifyByWaitingCountAndSignalToAddressIfEqual(VAddr address, s32 value, s32 num_to_wake);
ResultCode WaitForAddressIfLessThan(VAddr address, s32 value, s64 timeout, bool should_decrement);
ResultCode WaitForAddressIfEqual(VAddr address, s32 value, s64 timeout);
} // namespace AddressArbiter
} // namespace Kernel

View File

@@ -20,13 +20,16 @@ enum {
MaxConnectionsReached = 52,
// Confirmed Switch OS error codes
MisalignedAddress = 102,
InvalidAddress = 102,
InvalidMemoryState = 106,
InvalidProcessorId = 113,
InvalidHandle = 114,
InvalidCombination = 116,
Timeout = 117,
SynchronizationCanceled = 118,
TooLarge = 119,
InvalidEnumValue = 120,
InvalidState = 125,
};
}
@@ -39,14 +42,15 @@ constexpr ResultCode ERR_SESSION_CLOSED_BY_REMOTE(-1);
constexpr ResultCode ERR_PORT_NAME_TOO_LONG(-1);
constexpr ResultCode ERR_WRONG_PERMISSION(-1);
constexpr ResultCode ERR_MAX_CONNECTIONS_REACHED(-1);
constexpr ResultCode ERR_INVALID_ENUM_VALUE(-1);
constexpr ResultCode ERR_INVALID_ENUM_VALUE(ErrorModule::Kernel, ErrCodes::InvalidEnumValue);
constexpr ResultCode ERR_INVALID_ENUM_VALUE_FND(-1);
constexpr ResultCode ERR_INVALID_COMBINATION(-1);
constexpr ResultCode ERR_INVALID_COMBINATION_KERNEL(-1);
constexpr ResultCode ERR_OUT_OF_MEMORY(-1);
constexpr ResultCode ERR_INVALID_ADDRESS(-1);
constexpr ResultCode ERR_INVALID_ADDRESS_STATE(-1);
constexpr ResultCode ERR_INVALID_ADDRESS(ErrorModule::Kernel, ErrCodes::InvalidAddress);
constexpr ResultCode ERR_INVALID_ADDRESS_STATE(ErrorModule::Kernel, ErrCodes::InvalidMemoryState);
constexpr ResultCode ERR_INVALID_HANDLE(ErrorModule::Kernel, ErrCodes::InvalidHandle);
constexpr ResultCode ERR_INVALID_STATE(ErrorModule::Kernel, ErrCodes::InvalidState);
constexpr ResultCode ERR_INVALID_POINTER(-1);
constexpr ResultCode ERR_INVALID_OBJECT_ADDR(-1);
constexpr ResultCode ERR_NOT_AUTHORIZED(-1);
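
The hunk above moves several errors from the placeholder `-1` to a `(module, description)` pair. As a rough illustration of what such a pair packs into, here is a sketch under stated assumptions: a 9-bit module field in the low bits, a 13-bit description field above it, and a module value of 1 for the kernel. None of these layout details appear in this diff, and `PackResultSketch`, `ResultModuleOf` and `ResultDescriptionOf` are hypothetical helpers, not the `ResultCode` class itself.

```cpp
#include <cassert>
#include <cstdint>

// Assumed layout (not shown in this diff): bits 0-8 hold the module,
// bits 9-21 hold the description.
constexpr std::uint32_t kModuleBits = 9;
constexpr std::uint32_t kDescriptionBits = 13;
constexpr std::uint32_t kKernelModuleSketch = 1; // assumption

// Packs a module/description pair into one raw 32-bit result value.
constexpr std::uint32_t PackResultSketch(std::uint32_t module, std::uint32_t description) {
    return (module & ((1u << kModuleBits) - 1)) |
           ((description & ((1u << kDescriptionBits) - 1)) << kModuleBits);
}

// Extracts the module field from a raw result value.
constexpr std::uint32_t ResultModuleOf(std::uint32_t raw) {
    return raw & ((1u << kModuleBits) - 1);
}

// Extracts the description field from a raw result value.
constexpr std::uint32_t ResultDescriptionOf(std::uint32_t raw) {
    return (raw >> kModuleBits) & ((1u << kDescriptionBits) - 1);
}
```

Under that assumed layout, the `Timeout = 117` description from the enum above packs to `0xEA01` and `InvalidState = 125` packs to `0xFA01`, which is why a distinct description per error is more useful to the guest than a shared `-1`.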

View File

@@ -271,6 +271,11 @@ std::vector<u8> HLERequestContext::ReadBuffer(int buffer_index) const {
}
size_t HLERequestContext::WriteBuffer(const void* buffer, size_t size, int buffer_index) const {
if (size == 0) {
NGLOG_WARNING(Core, "skip empty buffer write");
return 0;
}
const bool is_buffer_b{BufferDescriptorB().size() && BufferDescriptorB()[buffer_index].Size()};
const size_t buffer_size{GetWriteBufferSize(buffer_index)};
if (size > buffer_size) {

View File

@@ -59,7 +59,7 @@ ResultCode Mutex::TryAcquire(VAddr address, Handle holding_thread_handle,
Handle requesting_thread_handle) {
// The mutex address must be 4-byte aligned
if ((address % sizeof(u32)) != 0) {
return ResultCode(ErrorModule::Kernel, ErrCodes::MisalignedAddress);
return ResultCode(ErrorModule::Kernel, ErrCodes::InvalidAddress);
}
SharedPtr<Thread> holding_thread = g_handle_table.Get<Thread>(holding_thread_handle);
@@ -97,7 +97,7 @@ ResultCode Mutex::TryAcquire(VAddr address, Handle holding_thread_handle,
ResultCode Mutex::Release(VAddr address) {
// The mutex address must be 4-byte aligned
if ((address % sizeof(u32)) != 0) {
return ResultCode(ErrorModule::Kernel, ErrCodes::MisalignedAddress);
return ResultCode(ErrorModule::Kernel, ErrCodes::InvalidAddress);
}
auto [thread, num_waiters] = GetHighestPriorityMutexWaitingThread(GetCurrentThread(), address);

View File

@@ -11,6 +11,7 @@
#include "common/string_util.h"
#include "core/core.h"
#include "core/core_timing.h"
#include "core/hle/kernel/address_arbiter.h"
#include "core/hle/kernel/client_port.h"
#include "core/hle/kernel/client_session.h"
#include "core/hle/kernel/event.h"
@@ -316,6 +317,11 @@ static ResultCode GetInfo(u64* result, u64 info_id, u64 handle, u64 info_sub_id)
"(STUBBED) Attempted to query privileged process id bounds, returned 0");
*result = 0;
break;
case GetInfoType::UserExceptionContextAddr:
NGLOG_WARNING(Kernel_SVC,
"(STUBBED) Attempted to query user exception context address, returned 0");
*result = 0;
break;
default:
UNIMPLEMENTED();
}
@@ -575,7 +581,7 @@ static void SleepThread(s64 nanoseconds) {
Core::System::GetInstance().PrepareReschedule();
}
/// Signal process wide key atomic
/// Wait process wide key atomic
static ResultCode WaitProcessWideKeyAtomic(VAddr mutex_addr, VAddr condition_variable_addr,
Handle thread_handle, s64 nano_seconds) {
NGLOG_TRACE(
@@ -684,6 +690,58 @@ static ResultCode SignalProcessWideKey(VAddr condition_variable_addr, s32 target
return RESULT_SUCCESS;
}
// Wait for an address (via Address Arbiter)
static ResultCode WaitForAddress(VAddr address, u32 type, s32 value, s64 timeout) {
NGLOG_WARNING(Kernel_SVC, "called, address=0x{:X}, type=0x{:X}, value=0x{:X}, timeout={}",
address, type, value, timeout);
// If the passed address is a kernel virtual address, return invalid memory state.
if (Memory::IsKernelVirtualAddress(address)) {
return ERR_INVALID_ADDRESS_STATE;
}
// If the address is not properly aligned to 4 bytes, return invalid address.
if (address % sizeof(u32) != 0) {
return ERR_INVALID_ADDRESS;
}
switch (static_cast<AddressArbiter::ArbitrationType>(type)) {
case AddressArbiter::ArbitrationType::WaitIfLessThan:
return AddressArbiter::WaitForAddressIfLessThan(address, value, timeout, false);
case AddressArbiter::ArbitrationType::DecrementAndWaitIfLessThan:
return AddressArbiter::WaitForAddressIfLessThan(address, value, timeout, true);
case AddressArbiter::ArbitrationType::WaitIfEqual:
return AddressArbiter::WaitForAddressIfEqual(address, value, timeout);
default:
return ERR_INVALID_ENUM_VALUE;
}
}
// Signals to an address (via Address Arbiter)
static ResultCode SignalToAddress(VAddr address, u32 type, s32 value, s32 num_to_wake) {
NGLOG_WARNING(Kernel_SVC,
"called, address=0x{:X}, type=0x{:X}, value=0x{:X}, num_to_wake=0x{:X}", address,
type, value, num_to_wake);
// If the passed address is a kernel virtual address, return invalid memory state.
if (Memory::IsKernelVirtualAddress(address)) {
return ERR_INVALID_ADDRESS_STATE;
}
// If the address is not properly aligned to 4 bytes, return invalid address.
if (address % sizeof(u32) != 0) {
return ERR_INVALID_ADDRESS;
}
switch (static_cast<AddressArbiter::SignalType>(type)) {
case AddressArbiter::SignalType::Signal:
return AddressArbiter::SignalToAddress(address, num_to_wake);
case AddressArbiter::SignalType::IncrementAndSignalIfEqual:
return AddressArbiter::IncrementAndSignalToAddressIfEqual(address, value, num_to_wake);
case AddressArbiter::SignalType::ModifyByWaitingCountAndSignalIfEqual:
return AddressArbiter::ModifyByWaitingCountAndSignalToAddressIfEqual(address, value,
num_to_wake);
default:
return ERR_INVALID_ENUM_VALUE;
}
}
/// This returns the total CPU ticks elapsed since the CPU was powered-on
static u64 GetSystemTick() {
const u64 result{CoreTiming::GetTicks()};
@@ -744,7 +802,7 @@ static ResultCode SetThreadCoreMask(Handle thread_handle, u32 core, u64 mask) {
ASSERT(thread->owner_process->ideal_processor != THREADPROCESSORID_DEFAULT);
// Set the target CPU to the one specified in the process' exheader.
core = thread->owner_process->ideal_processor;
mask = 1 << core;
mask = 1ull << core;
}
if (mask == 0) {
@@ -761,7 +819,7 @@ static ResultCode SetThreadCoreMask(Handle thread_handle, u32 core, u64 mask) {
}
// Error out if the input core isn't enabled in the input mask.
if (core < Core::NUM_CPU_CORES && (mask & (1 << core)) == 0) {
if (core < Core::NUM_CPU_CORES && (mask & (1ull << core)) == 0) {
return ResultCode(ErrorModule::Kernel, ErrCodes::InvalidCombination);
}
@@ -856,8 +914,8 @@ static const FunctionDef SVC_Table[] = {
{0x31, nullptr, "GetResourceLimitCurrentValue"},
{0x32, SvcWrap<SetThreadActivity>, "SetThreadActivity"},
{0x33, SvcWrap<GetThreadContext>, "GetThreadContext"},
{0x34, nullptr, "WaitForAddress"},
{0x35, nullptr, "SignalToAddress"},
{0x34, SvcWrap<WaitForAddress>, "WaitForAddress"},
{0x35, SvcWrap<SignalToAddress>, "SignalToAddress"},
{0x36, nullptr, "Unknown"},
{0x37, nullptr, "Unknown"},
{0x38, nullptr, "Unknown"},
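
The `SetThreadCoreMask` hunks above change `1 << core` to `1ull << core`. The literal `1` has type `int`, so the shift is evaluated at `int` width before being widened to the 64-bit mask, and shifting by 32 or more is undefined behavior when `int` is 32 bits. A minimal sketch of the 64-bit form follows; `CoreMaskSketch` and `IsCoreEnabled` are hypothetical helper names used only for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Builds a 64-bit affinity mask with only bit `core` set. The shift is done
// on a 64-bit operand, so it is well defined for core in [0, 63].
inline std::uint64_t CoreMaskSketch(std::uint32_t core) {
    assert(core < 64);
    return std::uint64_t{1} << core;
}

// Returns true if `core` is enabled in `mask`.
inline bool IsCoreEnabled(std::uint64_t mask, std::uint32_t core) {
    return (mask & CoreMaskSketch(core)) != 0;
}
```

For the small core indices used on this platform both spellings give the same value; the 64-bit form simply keeps the expression well defined across the full width of the `u64` mask.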

View File

@@ -179,6 +179,20 @@ void SvcWrap() {
FuncReturn(retval);
}
template <ResultCode func(u64, u32, s32, s64)>
void SvcWrap() {
FuncReturn(
func(PARAM(0), (u32)(PARAM(1) & 0xFFFFFFFF), (s32)(PARAM(2) & 0xFFFFFFFF), (s64)PARAM(3))
.raw);
}
template <ResultCode func(u64, u32, s32, s32)>
void SvcWrap() {
FuncReturn(func(PARAM(0), (u32)(PARAM(1) & 0xFFFFFFFF), (s32)(PARAM(2) & 0xFFFFFFFF),
(s32)(PARAM(3) & 0xFFFFFFFF))
.raw);
}
////////////////////////////////////////////////////////////////////////////////////////////////////
// Function wrappers that return type u32
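
The two new `SvcWrap` specializations above narrow 64-bit register values to the 32-bit parameter types the handlers expect. A standalone sketch of that narrowing is shown below. `ArbiterArgsSketch` and `UnpackArbiterArgs` are hypothetical names, and the register file is modelled as a plain array rather than the emulated CPU state.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical container for the four arguments of a WaitForAddress-style call.
struct ArbiterArgsSketch {
    std::uint64_t address;
    std::uint32_t type;
    std::int32_t value;
    std::int64_t timeout;
};

// Narrows four 64-bit register values: the upper 32 bits of the `type` and
// `value` registers are discarded, and `timeout` is reinterpreted as signed.
inline ArbiterArgsSketch UnpackArbiterArgs(const std::array<std::uint64_t, 4>& regs) {
    ArbiterArgsSketch args{};
    args.address = regs[0];
    args.type = static_cast<std::uint32_t>(regs[1] & 0xFFFFFFFF);
    args.value = static_cast<std::int32_t>(static_cast<std::uint32_t>(regs[2] & 0xFFFFFFFF));
    args.timeout = static_cast<std::int64_t>(regs[3]);
    return args;
}
```

Masking before the cast makes the intent explicit: stale upper bits in a register do not leak into the 32-bit arguments, and a low word of `0xFFFFFFFF` arrives as the signed value -1 on the two's complement targets this code runs on.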

View File

@@ -140,6 +140,11 @@ static void ThreadWakeupCallback(u64 thread_handle, int cycles_late) {
}
}
if (thread->arb_wait_address != 0) {
ASSERT(thread->status == THREADSTATUS_WAIT_ARB);
thread->arb_wait_address = 0;
}
if (resume)
thread->ResumeFromWait();
}
@@ -179,6 +184,7 @@ void Thread::ResumeFromWait() {
case THREADSTATUS_WAIT_SLEEP:
case THREADSTATUS_WAIT_IPC:
case THREADSTATUS_WAIT_MUTEX:
case THREADSTATUS_WAIT_ARB:
break;
case THREADSTATUS_READY:

View File

@@ -45,6 +45,7 @@ enum ThreadStatus {
THREADSTATUS_WAIT_SYNCH_ANY, ///< Waiting due to WaitSynch1 or WaitSynchN with wait_all = false
THREADSTATUS_WAIT_SYNCH_ALL, ///< Waiting due to WaitSynchronizationN with wait_all = true
THREADSTATUS_WAIT_MUTEX, ///< Waiting due to an ArbitrateLock/WaitProcessWideKey svc
THREADSTATUS_WAIT_ARB, ///< Waiting due to a SignalToAddress/WaitForAddress svc
THREADSTATUS_DORMANT, ///< Created but not yet made ready
THREADSTATUS_DEAD ///< Run to completion, or forcefully terminated
};
@@ -230,6 +231,9 @@ public:
VAddr mutex_wait_address; ///< If waiting on a Mutex, this is the mutex address
Handle wait_handle; ///< The handle used to wait for the mutex.
// If waiting for an AddressArbiter, this is the address being waited on.
VAddr arb_wait_address{0};
std::string name;
/// Handle used by guest emulated application to access this thread


@@ -8,6 +8,7 @@
#include "core/hle/service/audio/audrec_u.h"
#include "core/hle/service/audio/audren_u.h"
#include "core/hle/service/audio/codecctl.h"
#include "core/hle/service/audio/hwopus.h"
namespace Service::Audio {
@@ -17,6 +18,7 @@ void InstallInterfaces(SM::ServiceManager& service_manager) {
std::make_shared<AudRecU>()->InstallAsService(service_manager);
std::make_shared<AudRenU>()->InstallAsService(service_manager);
std::make_shared<CodecCtl>()->InstallAsService(service_manager);
std::make_shared<HwOpus>()->InstallAsService(service_manager);
}
} // namespace Service::Audio


@@ -17,7 +17,8 @@ constexpr u64 audio_ticks{static_cast<u64>(CoreTiming::BASE_CLOCK_RATE / 200)};
class IAudioRenderer final : public ServiceFramework<IAudioRenderer> {
public:
IAudioRenderer() : ServiceFramework("IAudioRenderer") {
IAudioRenderer(AudioRendererParameter audren_params)
: ServiceFramework("IAudioRenderer"), worker_params(audren_params) {
static const FunctionInfo functions[] = {
{0, nullptr, "GetAudioRendererSampleRate"},
{1, nullptr, "GetAudioRendererSampleCount"},
@@ -57,27 +58,37 @@ private:
}
void RequestUpdateAudioRenderer(Kernel::HLERequestContext& ctx) {
NGLOG_DEBUG(Service_Audio, "{}", ctx.Description());
AudioRendererResponseData response_data{};
UpdateDataHeader config{};
auto buf = ctx.ReadBuffer();
std::memcpy(&config, buf.data(), sizeof(UpdateDataHeader));
u32 memory_pool_count = worker_params.effect_count + (worker_params.voice_count * 4);
response_data.section_0_size =
static_cast<u32>(response_data.state_entries.size() * sizeof(AudioRendererStateEntry));
response_data.section_1_size = static_cast<u32>(response_data.section_1.size());
response_data.section_2_size = static_cast<u32>(response_data.section_2.size());
response_data.section_3_size = static_cast<u32>(response_data.section_3.size());
response_data.section_4_size = static_cast<u32>(response_data.section_4.size());
response_data.section_5_size = static_cast<u32>(response_data.section_5.size());
response_data.total_size = sizeof(AudioRendererResponseData);
std::vector<MemoryPoolInfo> mem_pool_info(memory_pool_count);
std::memcpy(mem_pool_info.data(),
buf.data() + sizeof(UpdateDataHeader) + config.behavior_size,
memory_pool_count * sizeof(MemoryPoolInfo));
for (unsigned i = 0; i < response_data.state_entries.size(); i++) {
// 4 = Busy and 5 = Ready?
response_data.state_entries[i].state = 5;
UpdateDataHeader response_data{worker_params};
ASSERT(ctx.GetWriteBufferSize() == response_data.total_size);
std::vector<u8> output(response_data.total_size);
std::memcpy(output.data(), &response_data, sizeof(UpdateDataHeader));
std::vector<MemoryPoolEntry> memory_pool(memory_pool_count);
for (unsigned i = 0; i < memory_pool.size(); i++) {
if (mem_pool_info[i].pool_state == MemoryPoolStates::RequestAttach)
memory_pool[i].state = MemoryPoolStates::Attached;
else if (mem_pool_info[i].pool_state == MemoryPoolStates::RequestDetach)
memory_pool[i].state = MemoryPoolStates::Detached;
else
memory_pool[i].state = mem_pool_info[i].pool_state;
}
std::memcpy(output.data() + sizeof(UpdateDataHeader), memory_pool.data(),
response_data.memory_pools_size);
ctx.WriteBuffer(&response_data, response_data.total_size);
ctx.WriteBuffer(output);
IPC::ResponseBuilder rb{ctx, 2};
rb.Push(RESULT_SUCCESS);
NGLOG_WARNING(Service_Audio, "(STUBBED) called");
@@ -109,48 +120,66 @@ private:
NGLOG_WARNING(Service_Audio, "(STUBBED) called");
}
struct AudioRendererStateEntry {
u32_le state;
enum class MemoryPoolStates : u32 { // Should be LE
Invalid = 0x0,
Unknown = 0x1,
RequestDetach = 0x2,
Detached = 0x3,
RequestAttach = 0x4,
Attached = 0x5,
Released = 0x6,
};
struct MemoryPoolEntry {
MemoryPoolStates state;
u32_le unknown_4;
u32_le unknown_8;
u32_le unknown_c;
};
static_assert(sizeof(AudioRendererStateEntry) == 0x10,
"AudioRendererStateEntry has wrong size");
static_assert(sizeof(MemoryPoolEntry) == 0x10, "MemoryPoolEntry has wrong size");
struct AudioRendererResponseData {
u32_le unknown_0;
u32_le section_5_size;
u32_le section_0_size;
u32_le section_1_size;
u32_le unknown_10;
u32_le section_2_size;
u32_le unknown_18;
u32_le section_3_size;
u32_le section_4_size;
u32_le unknown_24;
u32_le unknown_28;
u32_le unknown_2c;
u32_le unknown_30;
u32_le unknown_34;
u32_le unknown_38;
u32_le total_size;
std::array<AudioRendererStateEntry, 0x18e> state_entries;
std::array<u8, 0x600> section_1;
std::array<u8, 0xe0> section_2;
std::array<u8, 0x20> section_3;
std::array<u8, 0x10> section_4;
std::array<u8, 0xb0> section_5;
struct MemoryPoolInfo {
u64_le pool_address;
u64_le pool_size;
MemoryPoolStates pool_state;
INSERT_PADDING_WORDS(3); // Unknown
};
static_assert(sizeof(AudioRendererResponseData) == 0x20e0,
"AudioRendererResponseData has wrong size");
static_assert(sizeof(MemoryPoolInfo) == 0x20, "MemoryPoolInfo has wrong size");
struct UpdateDataHeader {
UpdateDataHeader() {}
UpdateDataHeader(const AudioRendererParameter& config) {
revision = Common::MakeMagic('R', 'E', 'V', '4'); // 5.1.0 Revision
behavior_size = 0xb0;
memory_pools_size = (config.effect_count + (config.voice_count * 4)) * 0x10;
voices_size = config.voice_count * 0x10;
effects_size = config.effect_count * 0x10;
sinks_size = config.sink_count * 0x20;
performance_manager_size = 0x10;
total_size = sizeof(UpdateDataHeader) + behavior_size + memory_pools_size +
voices_size + effects_size + sinks_size + performance_manager_size;
}
u32_le revision;
u32_le behavior_size;
u32_le memory_pools_size;
u32_le voices_size;
u32_le voice_resource_size;
u32_le effects_size;
u32_le mixes_size;
u32_le sinks_size;
u32_le performance_manager_size;
INSERT_PADDING_WORDS(6);
u32_le total_size;
};
static_assert(sizeof(UpdateDataHeader) == 0x40, "UpdateDataHeader has wrong size");
/// This is used to trigger the audio event callback.
CoreTiming::EventType* audio_event;
Kernel::SharedPtr<Kernel::Event> system_event;
AudioRendererParameter worker_params;
};
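Note on the `UpdateDataHeader` constructor above: the response size is a fixed header plus per-section sizes derived from the renderer counts. The sketch below restates that arithmetic in isolation. `RendererCounts`, `ResponseTotalSize` and `CheckWriteBuffer` are hypothetical names used only for illustration; the constants are copied from the diff.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, reduced copy of the counts read from AudioRendererParameter.
struct RendererCounts {
    std::uint32_t voice_count;
    std::uint32_t sink_count;
    std::uint32_t effect_count;
};

constexpr std::uint32_t kHeaderSize = 0x40;      // sizeof(UpdateDataHeader) per the static_assert
constexpr std::uint32_t kBehaviorSize = 0xb0;    // behavior_size in the diff
constexpr std::uint32_t kPerformanceSize = 0x10; // performance_manager_size in the diff

// Mirrors the section arithmetic of the UpdateDataHeader constructor.
inline std::uint32_t ResponseTotalSize(const RendererCounts& c) {
    const std::uint32_t memory_pools = (c.effect_count + c.voice_count * 4) * 0x10;
    const std::uint32_t voices = c.voice_count * 0x10;
    const std::uint32_t effects = c.effect_count * 0x10;
    const std::uint32_t sinks = c.sink_count * 0x20;
    return kHeaderSize + kBehaviorSize + memory_pools + voices + effects + sinks +
           kPerformanceSize;
}

// Same idea as the ASSERT on GetWriteBufferSize(): the guest buffer must match exactly.
inline void CheckWriteBuffer(std::uint32_t write_buffer_size, const RendererCounts& c) {
    assert(write_buffer_size == ResponseTotalSize(c));
}
```

For two voices, one sink and one effect this gives 0x1E0 bytes; with all counts at zero only the header, behavior and performance sections remain (0x100 bytes).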
class IAudioDevice final : public ServiceFramework<IAudioDevice> {
@@ -248,31 +277,33 @@ AudRenU::AudRenU() : ServiceFramework("audren:u") {
}
void AudRenU::OpenAudioRenderer(Kernel::HLERequestContext& ctx) {
IPC::RequestParser rp{ctx};
auto params = rp.PopRaw<AudioRendererParameter>();
IPC::ResponseBuilder rb{ctx, 2, 0, 1};
rb.Push(RESULT_SUCCESS);
rb.PushIpcInterface<Audio::IAudioRenderer>();
rb.PushIpcInterface<Audio::IAudioRenderer>(std::move(params));
NGLOG_DEBUG(Service_Audio, "called");
}
void AudRenU::GetAudioRendererWorkBufferSize(Kernel::HLERequestContext& ctx) {
IPC::RequestParser rp{ctx};
auto params = rp.PopRaw<WorkerBufferParameters>();
auto params = rp.PopRaw<AudioRendererParameter>();
u64 buffer_sz = Common::AlignUp(4 * params.unknown8, 0x40);
buffer_sz += params.unknownC * 1024;
buffer_sz += 0x940 * (params.unknownC + 1);
u64 buffer_sz = Common::AlignUp(4 * params.unknown_8, 0x40);
buffer_sz += params.unknown_c * 1024;
buffer_sz += 0x940 * (params.unknown_c + 1);
buffer_sz += 0x3F0 * params.voice_count;
buffer_sz += Common::AlignUp(8 * (params.unknownC + 1), 0x10);
buffer_sz += Common::AlignUp(8 * (params.unknown_c + 1), 0x10);
buffer_sz += Common::AlignUp(8 * params.voice_count, 0x10);
buffer_sz +=
Common::AlignUp((0x3C0 * (params.sink_count + params.unknownC) + 4 * params.sample_count) *
(params.unknown8 + 6),
Common::AlignUp((0x3C0 * (params.sink_count + params.unknown_c) + 4 * params.sample_count) *
(params.unknown_8 + 6),
0x40);
if (IsFeatureSupported(AudioFeatures::Splitter, params.magic)) {
u32 count = params.unknownC + 1;
if (IsFeatureSupported(AudioFeatures::Splitter, params.revision)) {
u32 count = params.unknown_c + 1;
u64 node_count = Common::AlignUp(count, 0x40);
u64 node_state_buffer_sz =
4 * (node_count * node_count) + 0xC * node_count + 2 * (node_count / 8);
@@ -287,20 +318,20 @@ void AudRenU::GetAudioRendererWorkBufferSize(Kernel::HLERequestContext& ctx) {
}
buffer_sz += 0x20 * (params.effect_count + 4 * params.voice_count) + 0x50;
if (IsFeatureSupported(AudioFeatures::Splitter, params.magic)) {
buffer_sz += 0xE0 * params.unknown2c;
if (IsFeatureSupported(AudioFeatures::Splitter, params.revision)) {
buffer_sz += 0xE0 * params.unknown_2c;
buffer_sz += 0x20 * params.splitter_count;
buffer_sz += Common::AlignUp(4 * params.unknown2c, 0x10);
buffer_sz += Common::AlignUp(4 * params.unknown_2c, 0x10);
}
buffer_sz = Common::AlignUp(buffer_sz, 0x40) + 0x170 * params.sink_count;
u64 output_sz = buffer_sz + 0x280 * params.sink_count + 0x4B0 * params.effect_count +
((params.voice_count * 256) | 0x40);
if (params.unknown1c >= 1) {
if (params.unknown_1c >= 1) {
output_sz = Common::AlignUp(((16 * params.sink_count + 16 * params.effect_count +
16 * params.voice_count + 16) +
0x658) *
(params.unknown1c + 1) +
(params.unknown_1c + 1) +
0xc0,
0x40) +
output_sz;
@@ -328,7 +359,7 @@ bool AudRenU::IsFeatureSupported(AudioFeatures feature, u32_le revision) const {
u32_be version_num = (revision - Common::MakeMagic('R', 'E', 'V', '0')); // Byte swap
switch (feature) {
case AudioFeatures::Splitter:
return version_num >= 2;
return version_num >= 2u;
default:
return false;
}
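Note on `IsFeatureSupported` above: the revision is a four-character magic such as `REV4`, and the feature check only cares about the trailing digit. The sketch below is one way to extract that digit. It assumes the magic packs its first character into the least significant byte, as the stand-in `MakeMagic` here does; `Common::MakeMagic` itself is not shown in this diff, so treat that layout as an assumption. The sketch uses a plain shift rather than the `u32_be` assignment in the original.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for Common::MakeMagic (assumed layout: first character in the low byte).
constexpr std::uint32_t MakeMagic(char a, char b, char c, char d) {
    return static_cast<std::uint32_t>(static_cast<unsigned char>(a)) |
           static_cast<std::uint32_t>(static_cast<unsigned char>(b)) << 8 |
           static_cast<std::uint32_t>(static_cast<unsigned char>(c)) << 16 |
           static_cast<std::uint32_t>(static_cast<unsigned char>(d)) << 24;
}

// Hypothetical helper: returns N for a magic of the form "REVN".
inline std::uint32_t RevisionNumber(std::uint32_t revision) {
    constexpr std::uint32_t base = MakeMagic('R', 'E', 'V', '0');
    assert((revision & 0x00FFFFFFu) == (base & 0x00FFFFFFu) && "magic must start with REV");
    // Under the assumed layout the digit lives in the top byte.
    return (revision - base) >> 24;
}

// Mirrors the AudioFeatures::Splitter case: available from revision 2 onwards.
inline bool SupportsSplitter(std::uint32_t revision) {
    return RevisionNumber(revision) >= 2u;
}
```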


@@ -12,6 +12,24 @@ class HLERequestContext;
namespace Service::Audio {
struct AudioRendererParameter {
u32_le sample_rate;
u32_le sample_count;
u32_le unknown_8;
u32_le unknown_c;
u32_le voice_count;
u32_le sink_count;
u32_le effect_count;
u32_le unknown_1c;
u8 unknown_20;
INSERT_PADDING_BYTES(3);
u32_le splitter_count;
u32_le unknown_2c;
INSERT_PADDING_WORDS(1);
u32_le revision;
};
static_assert(sizeof(AudioRendererParameter) == 52, "AudioRendererParameter is an invalid size");
class AudRenU final : public ServiceFramework<AudRenU> {
public:
explicit AudRenU();
@@ -22,25 +40,6 @@ private:
void GetAudioRendererWorkBufferSize(Kernel::HLERequestContext& ctx);
void GetAudioDevice(Kernel::HLERequestContext& ctx);
struct WorkerBufferParameters {
u32_le sample_rate;
u32_le sample_count;
u32_le unknown8;
u32_le unknownC;
u32_le voice_count;
u32_le sink_count;
u32_le effect_count;
u32_le unknown1c;
u8 unknown20;
u8 padding1[3];
u32_le splitter_count;
u32_le unknown2c;
u8 padding2[4];
u32_le magic;
};
static_assert(sizeof(WorkerBufferParameters) == 52,
"WorkerBufferParameters is an invalid size");
enum class AudioFeatures : u32 {
Splitter,
};


@@ -0,0 +1,29 @@
// Copyright 2018 yuzu emulator team
// Licensed under GPLv2 or any later version
// Refer to the license.txt file included.
#include "common/logging/log.h"
#include "core/hle/ipc_helpers.h"
#include "core/hle/kernel/hle_ipc.h"
#include "core/hle/service/audio/hwopus.h"
namespace Service::Audio {
void HwOpus::GetWorkBufferSize(Kernel::HLERequestContext& ctx) {
NGLOG_WARNING(Service_Audio, "(STUBBED) called");
IPC::ResponseBuilder rb{ctx, 3};
rb.Push(RESULT_SUCCESS);
rb.Push<u32>(0x4000);
}
HwOpus::HwOpus() : ServiceFramework("hwopus") {
static const FunctionInfo functions[] = {
{0, nullptr, "Initialize"},
{1, &HwOpus::GetWorkBufferSize, "GetWorkBufferSize"},
{2, nullptr, "InitializeMultiStream"},
{3, nullptr, "GetWorkBufferSizeMultiStream"},
};
RegisterHandlers(functions);
}
} // namespace Service::Audio


@@ -0,0 +1,20 @@
// Copyright 2018 yuzu emulator team
// Licensed under GPLv2 or any later version
// Refer to the license.txt file included.
#pragma once
#include "core/hle/service/service.h"
namespace Service::Audio {
class HwOpus final : public ServiceFramework<HwOpus> {
public:
explicit HwOpus();
~HwOpus() = default;
private:
void GetWorkBufferSize(Kernel::HLERequestContext& ctx);
};
} // namespace Service::Audio


@@ -84,6 +84,10 @@ private:
for (size_t controller = 0; controller < mem.controllers.size(); controller++) {
for (int index = 0; index < HID_NUM_LAYOUTS; index++) {
// TODO(DarkLordZach): Is this layout/controller config actually invalid?
if (controller == Controller_Handheld && index == Layout_Single)
continue;
ControllerLayout& layout = mem.controllers[controller].layouts[index];
layout.header.num_entries = HID_NUM_ENTRIES;
layout.header.max_entry_index = HID_NUM_ENTRIES - 1;


@@ -121,8 +121,9 @@ u32 nvhost_gpu::AllocateObjectContext(const std::vector<u8>& input, std::vector<
}
u32 nvhost_gpu::SubmitGPFIFO(const std::vector<u8>& input, std::vector<u8>& output) {
if (input.size() < sizeof(IoctlSubmitGpfifo))
if (input.size() < sizeof(IoctlSubmitGpfifo)) {
UNIMPLEMENTED();
}
IoctlSubmitGpfifo params{};
std::memcpy(&params, input.data(), sizeof(IoctlSubmitGpfifo));
NGLOG_WARNING(Service_NVDRV, "(STUBBED) called, gpfifo={:X}, num_entries={:X}, flags={:X}",


@@ -12,9 +12,6 @@
namespace Service::Set {
void SET::GetAvailableLanguageCodes(Kernel::HLERequestContext& ctx) {
IPC::RequestParser rp{ctx};
u32 id = rp.Pop<u32>();
static constexpr std::array<LanguageCode, 17> available_language_codes = {{
LanguageCode::JA,
LanguageCode::EN_US,
@@ -50,7 +47,7 @@ SET::SET() : ServiceFramework("set") {
{2, nullptr, "MakeLanguageCode"},
{3, nullptr, "GetAvailableLanguageCodeCount"},
{4, nullptr, "GetRegionCode"},
{5, nullptr, "GetAvailableLanguageCodes2"},
{5, &SET::GetAvailableLanguageCodes, "GetAvailableLanguageCodes2"},
{6, nullptr, "GetAvailableLanguageCodeCount2"},
{7, nullptr, "GetKeyCodeMap"},
{8, nullptr, "GetQuestFlag"},


@@ -9,6 +9,7 @@
#include "core/hle/kernel/process.h"
#include "core/loader/deconstructed_rom_directory.h"
#include "core/loader/elf.h"
#include "core/loader/nca.h"
#include "core/loader/nro.h"
#include "core/loader/nso.h"
@@ -32,6 +33,7 @@ FileType IdentifyFile(FileUtil::IOFile& file, const std::string& filepath) {
CHECK_TYPE(ELF)
CHECK_TYPE(NSO)
CHECK_TYPE(NRO)
CHECK_TYPE(NCA)
#undef CHECK_TYPE
@@ -57,6 +59,8 @@ FileType GuessFromExtension(const std::string& extension_) {
return FileType::NRO;
else if (extension == ".nso")
return FileType::NSO;
else if (extension == ".nca")
return FileType::NCA;
return FileType::Unknown;
}
@@ -69,6 +73,8 @@ const char* GetFileTypeString(FileType type) {
return "NRO";
case FileType::NSO:
return "NSO";
case FileType::NCA:
return "NCA";
case FileType::DeconstructedRomDirectory:
return "Directory";
case FileType::Error:
@@ -104,6 +110,10 @@ static std::unique_ptr<AppLoader> GetFileLoader(FileUtil::IOFile&& file, FileTyp
case FileType::NRO:
return std::make_unique<AppLoader_NRO>(std::move(file), filepath);
// NX NCA file format.
case FileType::NCA:
return std::make_unique<AppLoader_NCA>(std::move(file), filepath);
// NX deconstructed ROM directory.
case FileType::DeconstructedRomDirectory:
return std::make_unique<AppLoader_DeconstructedRomDirectory>(std::move(file), filepath);


@@ -29,6 +29,7 @@ enum class FileType {
ELF,
NSO,
NRO,
NCA,
DeconstructedRomDirectory,
};

src/core/loader/nca.cpp Normal file

@@ -0,0 +1,303 @@
// Copyright 2018 yuzu emulator team
// Licensed under GPLv2 or any later version
// Refer to the license.txt file included.
#include <vector>
#include "common/common_funcs.h"
#include "common/file_util.h"
#include "common/logging/log.h"
#include "common/swap.h"
#include "core/core.h"
#include "core/file_sys/program_metadata.h"
#include "core/file_sys/romfs_factory.h"
#include "core/hle/kernel/process.h"
#include "core/hle/kernel/resource_limit.h"
#include "core/hle/service/filesystem/filesystem.h"
#include "core/loader/nca.h"
#include "core/loader/nso.h"
#include "core/memory.h"
namespace Loader {
// Media offsets in headers are stored divided by 512. Multiply by this to get the real byte offset.
constexpr u64 MEDIA_OFFSET_MULTIPLIER = 0x200;
constexpr u64 SECTION_HEADER_SIZE = 0x200;
constexpr u64 SECTION_HEADER_OFFSET = 0x400;
enum class NcaContentType : u8 { Program = 0, Meta = 1, Control = 2, Manual = 3, Data = 4 };
enum class NcaSectionFilesystemType : u8 { PFS0 = 0x2, ROMFS = 0x3 };
struct NcaSectionTableEntry {
u32_le media_offset;
u32_le media_end_offset;
INSERT_PADDING_BYTES(0x8);
};
static_assert(sizeof(NcaSectionTableEntry) == 0x10, "NcaSectionTableEntry has incorrect size.");
struct NcaHeader {
std::array<u8, 0x100> rsa_signature_1;
std::array<u8, 0x100> rsa_signature_2;
u32_le magic;
u8 is_system;
NcaContentType content_type;
u8 crypto_type;
u8 key_index;
u64_le size;
u64_le title_id;
INSERT_PADDING_BYTES(0x4);
u32_le sdk_version;
u8 crypto_type_2;
INSERT_PADDING_BYTES(15);
std::array<u8, 0x10> rights_id;
std::array<NcaSectionTableEntry, 0x4> section_tables;
std::array<std::array<u8, 0x20>, 0x4> hash_tables;
std::array<std::array<u8, 0x10>, 0x4> key_area;
INSERT_PADDING_BYTES(0xC0);
};
static_assert(sizeof(NcaHeader) == 0x400, "NcaHeader has incorrect size.");
struct NcaSectionHeaderBlock {
INSERT_PADDING_BYTES(3);
NcaSectionFilesystemType filesystem_type;
u8 crypto_type;
INSERT_PADDING_BYTES(3);
};
static_assert(sizeof(NcaSectionHeaderBlock) == 0x8, "NcaSectionHeaderBlock has incorrect size.");
struct Pfs0Superblock {
NcaSectionHeaderBlock header_block;
std::array<u8, 0x20> hash;
u32_le size;
INSERT_PADDING_BYTES(4);
u64_le hash_table_offset;
u64_le hash_table_size;
u64_le pfs0_header_offset;
u64_le pfs0_size;
INSERT_PADDING_BYTES(432);
};
static_assert(sizeof(Pfs0Superblock) == 0x200, "Pfs0Superblock has incorrect size.");
static bool IsValidNca(const NcaHeader& header) {
return header.magic == Common::MakeMagic('N', 'C', 'A', '2') ||
header.magic == Common::MakeMagic('N', 'C', 'A', '3');
}
// TODO(DarkLordZach): Add support for encrypted.
class Nca final {
std::vector<FileSys::PartitionFilesystem> pfs;
std::vector<u64> pfs_offset;
u64 romfs_offset = 0;
u64 romfs_size = 0;
boost::optional<u8> exefs_id = boost::none;
FileUtil::IOFile file;
std::string path;
u64 GetExeFsFileOffset(const std::string& file_name) const;
u64 GetExeFsFileSize(const std::string& file_name) const;
public:
ResultStatus Load(FileUtil::IOFile&& file, std::string path);
FileSys::PartitionFilesystem GetPfs(u8 id) const;
u64 GetRomFsOffset() const;
u64 GetRomFsSize() const;
std::vector<u8> GetExeFsFile(const std::string& file_name);
};
static bool IsPfsExeFs(const FileSys::PartitionFilesystem& pfs) {
// According to switchbrew, an exefs must only contain these two files:
return pfs.GetFileSize("main") > 0 && pfs.GetFileSize("main.npdm") > 0;
}
ResultStatus Nca::Load(FileUtil::IOFile&& in_file, std::string in_path) {
file = std::move(in_file);
path = in_path;
file.Seek(0, SEEK_SET);
std::array<u8, sizeof(NcaHeader)> header_array{};
if (sizeof(NcaHeader) != file.ReadBytes(header_array.data(), sizeof(NcaHeader)))
NGLOG_CRITICAL(Loader, "File reader errored out during header read.");
NcaHeader header{};
std::memcpy(&header, header_array.data(), sizeof(NcaHeader));
if (!IsValidNca(header))
return ResultStatus::ErrorInvalidFormat;
int number_sections =
std::count_if(std::begin(header.section_tables), std::end(header.section_tables),
[](NcaSectionTableEntry entry) { return entry.media_offset > 0; });
for (int i = 0; i < number_sections; ++i) {
// Seek to beginning of this section.
file.Seek(SECTION_HEADER_OFFSET + i * SECTION_HEADER_SIZE, SEEK_SET);
std::array<u8, sizeof(NcaSectionHeaderBlock)> array{};
if (sizeof(NcaSectionHeaderBlock) !=
file.ReadBytes(array.data(), sizeof(NcaSectionHeaderBlock)))
NGLOG_CRITICAL(Loader, "File reader errored out during header read.");
NcaSectionHeaderBlock block{};
std::memcpy(&block, array.data(), sizeof(NcaSectionHeaderBlock));
if (block.filesystem_type == NcaSectionFilesystemType::ROMFS) {
romfs_offset = header.section_tables[i].media_offset * MEDIA_OFFSET_MULTIPLIER;
romfs_size =
header.section_tables[i].media_end_offset * MEDIA_OFFSET_MULTIPLIER - romfs_offset;
} else if (block.filesystem_type == NcaSectionFilesystemType::PFS0) {
Pfs0Superblock sb{};
// Seek back to beginning of this section.
file.Seek(SECTION_HEADER_OFFSET + i * SECTION_HEADER_SIZE, SEEK_SET);
if (sizeof(Pfs0Superblock) != file.ReadBytes(&sb, sizeof(Pfs0Superblock)))
NGLOG_CRITICAL(Loader, "File reader errored out during header read.");
u64 offset = (static_cast<u64>(header.section_tables[i].media_offset) *
MEDIA_OFFSET_MULTIPLIER) +
sb.pfs0_header_offset;
FileSys::PartitionFilesystem npfs{};
ResultStatus status = npfs.Load(path, offset);
if (status == ResultStatus::Success) {
pfs.emplace_back(std::move(npfs));
pfs_offset.emplace_back(offset);
}
}
}
for (size_t i = 0; i < pfs.size(); ++i) {
if (IsPfsExeFs(pfs[i]))
exefs_id = i;
}
return ResultStatus::Success;
}
FileSys::PartitionFilesystem Nca::GetPfs(u8 id) const {
return pfs[id];
}
u64 Nca::GetExeFsFileOffset(const std::string& file_name) const {
if (exefs_id == boost::none)
return 0;
return pfs[*exefs_id].GetFileOffset(file_name) + pfs_offset[*exefs_id];
}
u64 Nca::GetExeFsFileSize(const std::string& file_name) const {
if (exefs_id == boost::none)
return 0;
return pfs[*exefs_id].GetFileSize(file_name);
}
u64 Nca::GetRomFsOffset() const {
return romfs_offset;
}
u64 Nca::GetRomFsSize() const {
return romfs_size;
}
std::vector<u8> Nca::GetExeFsFile(const std::string& file_name) {
std::vector<u8> out(GetExeFsFileSize(file_name));
file.Seek(GetExeFsFileOffset(file_name), SEEK_SET);
file.ReadBytes(out.data(), GetExeFsFileSize(file_name));
return out;
}
AppLoader_NCA::AppLoader_NCA(FileUtil::IOFile&& file, std::string filepath)
: AppLoader(std::move(file)), filepath(std::move(filepath)) {}
FileType AppLoader_NCA::IdentifyType(FileUtil::IOFile& file, const std::string&) {
file.Seek(0, SEEK_SET);
std::array<u8, 0x400> header_enc_array{};
if (0x400 != file.ReadBytes(header_enc_array.data(), 0x400))
return FileType::Error;
// TODO(DarkLordZach): Assuming everything is decrypted. Add crypto support.
NcaHeader header{};
std::memcpy(&header, header_enc_array.data(), sizeof(NcaHeader));
if (IsValidNca(header) && header.content_type == NcaContentType::Program)
return FileType::NCA;
return FileType::Error;
}
ResultStatus AppLoader_NCA::Load(Kernel::SharedPtr<Kernel::Process>& process) {
if (is_loaded) {
return ResultStatus::ErrorAlreadyLoaded;
}
if (!file.IsOpen()) {
return ResultStatus::Error;
}
nca = std::make_unique<Nca>();
ResultStatus result = nca->Load(std::move(file), filepath);
if (result != ResultStatus::Success) {
return result;
}
result = metadata.Load(nca->GetExeFsFile("main.npdm"));
if (result != ResultStatus::Success) {
return result;
}
metadata.Print();
const FileSys::ProgramAddressSpaceType arch_bits{metadata.GetAddressSpaceType()};
if (arch_bits == FileSys::ProgramAddressSpaceType::Is32Bit) {
return ResultStatus::ErrorUnsupportedArch;
}
VAddr next_load_addr{Memory::PROCESS_IMAGE_VADDR};
for (const auto& module : {"rtld", "main", "subsdk0", "subsdk1", "subsdk2", "subsdk3",
"subsdk4", "subsdk5", "subsdk6", "subsdk7", "sdk"}) {
const VAddr load_addr = next_load_addr;
next_load_addr = AppLoader_NSO::LoadModule(module, nca->GetExeFsFile(module), load_addr);
if (next_load_addr) {
NGLOG_DEBUG(Loader, "loaded module {} @ 0x{:X}", module, load_addr);
} else {
next_load_addr = load_addr;
}
}
process->program_id = metadata.GetTitleID();
process->svc_access_mask.set();
process->address_mappings = default_address_mappings;
process->resource_limit =
Kernel::ResourceLimit::GetForCategory(Kernel::ResourceLimitCategory::APPLICATION);
process->Run(Memory::PROCESS_IMAGE_VADDR, metadata.GetMainThreadPriority(),
metadata.GetMainThreadStackSize());
if (nca->GetRomFsSize() > 0)
Service::FileSystem::RegisterFileSystem(std::make_unique<FileSys::RomFS_Factory>(*this),
Service::FileSystem::Type::RomFS);
is_loaded = true;
return ResultStatus::Success;
}
ResultStatus AppLoader_NCA::ReadRomFS(std::shared_ptr<FileUtil::IOFile>& romfs_file, u64& offset,
u64& size) {
if (nca->GetRomFsSize() == 0) {
NGLOG_DEBUG(Loader, "No RomFS available");
return ResultStatus::ErrorNotUsed;
}
romfs_file = std::make_shared<FileUtil::IOFile>(filepath, "rb");
offset = nca->GetRomFsOffset();
size = nca->GetRomFsSize();
NGLOG_DEBUG(Loader, "RomFS offset: 0x{:016X}", offset);
NGLOG_DEBUG(Loader, "RomFS size: 0x{:016X}", size);
return ResultStatus::Success;
}
AppLoader_NCA::~AppLoader_NCA() = default;
} // namespace Loader
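Note on the section table handling in `Nca::Load` above: offsets are stored in 0x200-byte media units, so both bounds are scaled before the size is taken. A minimal sketch of that conversion follows; `SectionRange` and `ToByteRange` are hypothetical names for illustration, and the explicit widening to 64 bits is a defensive choice of this sketch.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint64_t kMediaUnit = 0x200; // same value as MEDIA_OFFSET_MULTIPLIER

// Hypothetical result type: a section's position and length in bytes.
struct SectionRange {
    std::uint64_t offset;
    std::uint64_t size;
};

// Converts the media-unit bounds of a section table entry into a byte range.
inline SectionRange ToByteRange(std::uint32_t media_offset, std::uint32_t media_end_offset) {
    assert(media_end_offset >= media_offset && "section must not end before it starts");
    const std::uint64_t start = static_cast<std::uint64_t>(media_offset) * kMediaUnit;
    const std::uint64_t end = static_cast<std::uint64_t>(media_end_offset) * kMediaUnit;
    return {start, end - start};
}
```

An entry spanning media units 6 to 10 therefore starts at byte 0xC00 and is 0x800 bytes long.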

src/core/loader/nca.h Normal file

@@ -0,0 +1,49 @@
// Copyright 2018 yuzu emulator team
// Licensed under GPLv2 or any later version
// Refer to the license.txt file included.
#pragma once
#include <string>
#include "common/common_types.h"
#include "core/file_sys/partition_filesystem.h"
#include "core/file_sys/program_metadata.h"
#include "core/hle/kernel/kernel.h"
#include "core/loader/loader.h"
namespace Loader {
class Nca;
/// Loads an NCA file
class AppLoader_NCA final : public AppLoader {
public:
AppLoader_NCA(FileUtil::IOFile&& file, std::string filepath);
/**
* Returns the type of the file
* @param file FileUtil::IOFile open file
* @param filepath Path of the file that we are opening.
* @return FileType found, or FileType::Error if this loader doesn't know it
*/
static FileType IdentifyType(FileUtil::IOFile& file, const std::string& filepath);
FileType GetFileType() override {
return IdentifyType(file, filepath);
}
ResultStatus Load(Kernel::SharedPtr<Kernel::Process>& process) override;
ResultStatus ReadRomFS(std::shared_ptr<FileUtil::IOFile>& romfs_file, u64& offset,
u64& size) override;
~AppLoader_NCA();
private:
std::string filepath;
FileSys::ProgramMetadata metadata;
std::unique_ptr<Nca> nca;
};
} // namespace Loader


@@ -66,8 +66,22 @@ FileType AppLoader_NSO::IdentifyType(FileUtil::IOFile& file, const std::string&)
return FileType::Error;
}
static std::vector<u8> DecompressSegment(const std::vector<u8>& compressed_data,
const NsoSegmentHeader& header) {
std::vector<u8> uncompressed_data;
uncompressed_data.resize(header.size);
const int bytes_uncompressed = LZ4_decompress_safe(
reinterpret_cast<const char*>(compressed_data.data()),
reinterpret_cast<char*>(uncompressed_data.data()), compressed_data.size(), header.size);
ASSERT_MSG(bytes_uncompressed == header.size && bytes_uncompressed == uncompressed_data.size(),
"{} != {} != {}", bytes_uncompressed, header.size, uncompressed_data.size());
return uncompressed_data;
}
static std::vector<u8> ReadSegment(FileUtil::IOFile& file, const NsoSegmentHeader& header,
int compressed_size) {
size_t compressed_size) {
std::vector<u8> compressed_data;
compressed_data.resize(compressed_size);
@@ -77,22 +91,65 @@ static std::vector<u8> ReadSegment(FileUtil::IOFile& file, const NsoSegmentHeade
return {};
}
std::vector<u8> uncompressed_data;
uncompressed_data.resize(header.size);
const int bytes_uncompressed = LZ4_decompress_safe(
reinterpret_cast<const char*>(compressed_data.data()),
reinterpret_cast<char*>(uncompressed_data.data()), compressed_size, header.size);
ASSERT_MSG(bytes_uncompressed == header.size && bytes_uncompressed == uncompressed_data.size(),
"{} != {} != {}", bytes_uncompressed, header.size, uncompressed_data.size());
return uncompressed_data;
return DecompressSegment(compressed_data, header);
}
static constexpr u32 PageAlignSize(u32 size) {
return (size + Memory::PAGE_MASK) & ~Memory::PAGE_MASK;
}
VAddr AppLoader_NSO::LoadModule(const std::string& name, const std::vector<u8>& file_data,
VAddr load_base) {
if (file_data.size() < sizeof(NsoHeader))
return {};
NsoHeader nso_header;
std::memcpy(&nso_header, file_data.data(), sizeof(NsoHeader));
if (nso_header.magic != Common::MakeMagic('N', 'S', 'O', '0'))
return {};
// Build program image
Kernel::SharedPtr<Kernel::CodeSet> codeset = Kernel::CodeSet::Create("");
std::vector<u8> program_image;
for (int i = 0; i < nso_header.segments.size(); ++i) {
std::vector<u8> compressed_data(nso_header.segments_compressed_size[i]);
for (int j = 0; j < nso_header.segments_compressed_size[i]; ++j)
compressed_data[j] = file_data[nso_header.segments[i].offset + j];
std::vector<u8> data = DecompressSegment(compressed_data, nso_header.segments[i]);
program_image.resize(nso_header.segments[i].location);
program_image.insert(program_image.end(), data.begin(), data.end());
codeset->segments[i].addr = nso_header.segments[i].location;
codeset->segments[i].offset = nso_header.segments[i].location;
codeset->segments[i].size = PageAlignSize(static_cast<u32>(data.size()));
}
// MOD header pointer is at .text offset + 4
u32 module_offset;
std::memcpy(&module_offset, program_image.data() + 4, sizeof(u32));
// Read MOD header
ModHeader mod_header{};
// Default .bss to size in segment header if MOD0 section doesn't exist
u32 bss_size{PageAlignSize(nso_header.segments[2].bss_size)};
std::memcpy(&mod_header, program_image.data() + module_offset, sizeof(ModHeader));
const bool has_mod_header{mod_header.magic == Common::MakeMagic('M', 'O', 'D', '0')};
if (has_mod_header) {
// Resize program image to include .bss section and page align each section
bss_size = PageAlignSize(mod_header.bss_end_offset - mod_header.bss_start_offset);
}
codeset->data.size += bss_size;
const u32 image_size{PageAlignSize(static_cast<u32>(program_image.size()) + bss_size)};
program_image.resize(image_size);
// Load codeset for current process
codeset->name = name;
codeset->memory = std::make_shared<std::vector<u8>>(std::move(program_image));
Core::CurrentProcess()->LoadModule(codeset, load_base);
return load_base + image_size;
}
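Note on `PageAlignSize` and the `load_base + image_size` return value above: each module is rounded up to a whole number of pages, so the next module starts on a page boundary. The sketch below assumes 4 KiB pages (that is, a `Memory::PAGE_MASK` of 0xFFF, which is not shown in this diff). `NextLoadAddress` is a hypothetical helper for illustration.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint32_t kPageSize = 0x1000;        // assumption: 4 KiB pages
constexpr std::uint32_t kPageMask = kPageSize - 1; // assumed value of Memory::PAGE_MASK

// Rounds `size` up to the next multiple of the page size.
constexpr std::uint32_t PageAlignSize(std::uint32_t size) {
    return (size + kPageMask) & ~kPageMask;
}

// Hypothetical helper: where the next module would be placed after this one.
inline std::uint64_t NextLoadAddress(std::uint64_t load_base, std::uint32_t image_bytes,
                                     std::uint32_t bss_bytes) {
    assert(load_base % kPageSize == 0 && "modules are expected to start page-aligned");
    return load_base + PageAlignSize(image_bytes + bss_bytes);
}
```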
VAddr AppLoader_NSO::LoadModule(const std::string& path, VAddr load_base) {
FileUtil::IOFile file(path, "rb");
if (!file.IsOpen()) {


@@ -29,6 +29,9 @@ public:
return IdentifyType(file, filepath);
}
static VAddr LoadModule(const std::string& name, const std::vector<u8>& file_data,
VAddr load_base);
static VAddr LoadModule(const std::string& path, VAddr load_base);
ResultStatus Load(Kernel::SharedPtr<Kernel::Process>& process) override;


@@ -241,6 +241,10 @@ bool IsValidVirtualAddress(const VAddr vaddr) {
return IsValidVirtualAddress(*Core::CurrentProcess(), vaddr);
}
bool IsKernelVirtualAddress(const VAddr vaddr) {
return KERNEL_REGION_VADDR <= vaddr && vaddr < KERNEL_REGION_END;
}
bool IsValidPhysicalAddress(const PAddr paddr) {
return GetPhysicalPointer(paddr) != nullptr;
}


@@ -188,6 +188,11 @@ enum : VAddr {
MAP_REGION_VADDR = NEW_MAP_REGION_VADDR_END,
MAP_REGION_SIZE = 0x1000000000,
MAP_REGION_VADDR_END = MAP_REGION_VADDR + MAP_REGION_SIZE,
/// Kernel Virtual Address Range
KERNEL_REGION_VADDR = 0xFFFFFF8000000000,
KERNEL_REGION_SIZE = 0x7FFFE00000,
KERNEL_REGION_END = KERNEL_REGION_VADDR + KERNEL_REGION_SIZE,
};
/// Currently active page table
@@ -197,6 +202,8 @@ PageTable* GetCurrentPageTable();
/// Determines if the given VAddr is valid for the specified process.
bool IsValidVirtualAddress(const Kernel::Process& process, const VAddr vaddr);
bool IsValidVirtualAddress(const VAddr addr);
/// Determines if the given VAddr is a kernel address
bool IsKernelVirtualAddress(const VAddr addr);
bool IsValidPhysicalAddress(const PAddr addr);
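Note on `IsKernelVirtualAddress` above: the check is a half-open range test over the constants added to the address enum. A self-contained sketch with the same values follows; the `k`-prefixed names and `ExampleUsage` are stand-ins used only for illustration.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint64_t kKernelRegionVaddr = 0xFFFFFF8000000000ull;
constexpr std::uint64_t kKernelRegionSize = 0x7FFFE00000ull;
constexpr std::uint64_t kKernelRegionEnd = kKernelRegionVaddr + kKernelRegionSize;

// The sum must not wrap around, otherwise the range test below would be meaningless.
static_assert(kKernelRegionEnd > kKernelRegionVaddr, "kernel region must not overflow");

// True when `vaddr` lies in [kKernelRegionVaddr, kKernelRegionEnd).
inline bool IsKernelVirtualAddress(std::uint64_t vaddr) {
    return kKernelRegionVaddr <= vaddr && vaddr < kKernelRegionEnd;
}

// The first address is inside the region, the end address is not.
inline void ExampleUsage() {
    assert(IsKernelVirtualAddress(kKernelRegionVaddr));
    assert(!IsKernelVirtualAddress(kKernelRegionEnd));
}
```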


@@ -129,6 +129,7 @@ struct Values {
// Renderer
float resolution_factor;
bool toggle_framelimit;
bool use_accurate_framebuffers;
float bg_red;
float bg_green;


@@ -161,6 +161,8 @@ TelemetrySession::TelemetrySession() {
Settings::values.resolution_factor);
AddField(Telemetry::FieldType::UserConfig, "Renderer_ToggleFramelimit",
Settings::values.toggle_framelimit);
AddField(Telemetry::FieldType::UserConfig, "Renderer_UseAccurateFramebuffers",
Settings::values.use_accurate_framebuffers);
AddField(Telemetry::FieldType::UserConfig, "System_UseDockedMode",
Settings::values.use_docked_mode);
}


@@ -41,6 +41,8 @@ add_library(video_core STATIC
renderer_opengl/maxwell_to_gl.h
renderer_opengl/renderer_opengl.cpp
renderer_opengl/renderer_opengl.h
textures/astc.cpp
textures/astc.h
textures/decoders.cpp
textures/decoders.h
textures/texture.h


@@ -55,8 +55,10 @@ public:
virtual ~BreakPointObserver() {
auto context = context_weak.lock();
if (context) {
std::unique_lock<std::mutex> lock(context->breakpoint_mutex);
context->breakpoint_observers.remove(this);
{
std::unique_lock<std::mutex> lock(context->breakpoint_mutex);
context->breakpoint_observers.remove(this);
}
// If we are the last observer to be destroyed, tell the debugger context that
// it is free to continue. In particular, this is required for a proper yuzu


@@ -328,8 +328,9 @@ std::vector<Texture::FullTextureInfo> Maxwell3D::GetStageTextures(Regs::ShaderSt
Texture::FullTextureInfo tex_info{};
// TODO(Subv): Use the shader to determine which textures are actually accessed.
tex_info.index = (current_texture - tex_info_buffer.address - TextureInfoOffset) /
sizeof(Texture::TextureHandle);
tex_info.index =
static_cast<u32>(current_texture - tex_info_buffer.address - TextureInfoOffset) /
sizeof(Texture::TextureHandle);
// Load the TIC data.
if (tex_handle.tic_id != 0) {


@@ -321,6 +321,24 @@ public:
INSERT_PADDING_WORDS(1);
};
struct RenderTargetConfig {
u32 address_high;
u32 address_low;
u32 width;
u32 height;
Tegra::RenderTargetFormat format;
u32 block_dimensions;
u32 array_mode;
u32 layer_stride;
u32 base_layer;
INSERT_PADDING_WORDS(7);
GPUVAddr Address() const {
return static_cast<GPUVAddr>((static_cast<GPUVAddr>(address_high) << 32) |
address_low);
}
};
union {
struct {
INSERT_PADDING_WORDS(0x45);
@@ -333,23 +351,7 @@ public:
INSERT_PADDING_WORDS(0x1B8);
struct {
u32 address_high;
u32 address_low;
u32 width;
u32 height;
Tegra::RenderTargetFormat format;
u32 block_dimensions;
u32 array_mode;
u32 layer_stride;
u32 base_layer;
INSERT_PADDING_WORDS(7);
GPUVAddr Address() const {
return static_cast<GPUVAddr>((static_cast<GPUVAddr>(address_high) << 32) |
address_low);
}
} rt[NumRenderTargets];
RenderTargetConfig rt[NumRenderTargets];
struct {
f32 scale_x;
@@ -453,7 +455,11 @@ public:
u32 enable[NumRenderTargets];
} blend;
INSERT_PADDING_WORDS(0x77);
INSERT_PADDING_WORDS(0x2D);
u32 vb_element_base;
INSERT_PADDING_WORDS(0x49);
struct {
u32 tsc_address_high;
@@ -743,6 +749,7 @@ ASSERT_REG_POSITION(vertex_attrib_format[0], 0x458);
ASSERT_REG_POSITION(rt_control, 0x487);
ASSERT_REG_POSITION(independent_blend_enable, 0x4B9);
ASSERT_REG_POSITION(blend, 0x4CF);
ASSERT_REG_POSITION(vb_element_base, 0x50D);
ASSERT_REG_POSITION(tsc, 0x557);
ASSERT_REG_POSITION(tic, 0x55D);
ASSERT_REG_POSITION(code_address, 0x582);
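
Two details of the register changes in this file can be sanity-checked with a small sketch. It assumes `GPUVAddr` is a 64-bit unsigned integer, and `MakeGpuAddress` is a hypothetical free function that mirrors what `RenderTargetConfig::Address()` does in the diff. The padding arithmetic uses only the word counts shown above.

```cpp
#include <cstdint>

using GPUVAddr = std::uint64_t; // assumption: stand-in for Tegra::GPUVAddr

// The old 0x77-word padding is split into 0x2D words, the new one-word
// vb_element_base register, and 0x49 words, so later registers keep their positions.
static_assert(0x2D + 1 + 0x49 == 0x77, "padding split must preserve the register layout");

// Hypothetical free-function equivalent of RenderTargetConfig::Address().
constexpr GPUVAddr MakeGpuAddress(std::uint32_t address_high, std::uint32_t address_low) {
    // Widen to 64 bits before shifting; shifting a 32-bit value by 32 is undefined.
    return (static_cast<GPUVAddr>(address_high) << 32) | address_low;
}
```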


@@ -142,6 +142,7 @@ enum class PredCondition : u64 {
GreaterThan = 4,
NotEqual = 5,
GreaterEqual = 6,
NotEqualWithNan = 13,
// TODO(Subv): Other condition types
};
@@ -165,7 +166,6 @@ enum class SubOp : u64 {
Lg2 = 0x3,
Rcp = 0x4,
Rsq = 0x5,
Min = 0x8,
};
enum class F2iRoundingOp : u64 {
@@ -209,7 +209,7 @@ union Instruction {
} pred;
BitField<19, 1, u64> negate_pred;
BitField<20, 8, Register> gpr20;
BitField<20, 7, SubOp> sub_op;
BitField<20, 4, SubOp> sub_op;
BitField<28, 8, Register> gpr28;
BitField<39, 8, Register> gpr39;
BitField<48, 16, u64> opcode;
@@ -229,11 +229,19 @@ union Instruction {
BitField<42, 1, u64> negate_pred;
} fmnmx;
union {
BitField<39, 1, u64> invert_a;
BitField<40, 1, u64> invert_b;
BitField<41, 2, LogicOperation> operation;
BitField<44, 2, u64> unk44;
BitField<48, 3, Pred> pred48;
} lop;
union {
BitField<53, 2, LogicOperation> operation;
BitField<55, 1, u64> invert_a;
BitField<56, 1, u64> invert_b;
} lop;
} lop32i;
float GetImm20_19() const {
float result{};
@@ -343,7 +351,8 @@ union Instruction {
} iset;
union {
BitField<10, 2, Register::Size> size;
BitField<8, 2, Register::Size> dest_size;
BitField<10, 2, Register::Size> src_size;
BitField<12, 1, u64> is_output_signed;
BitField<13, 1, u64> is_input_signed;
BitField<41, 2, u64> selector;
@@ -363,7 +372,7 @@ union Instruction {
BitField<31, 4, u64> component_mask;
bool IsComponentEnabled(size_t component) const {
return ((1 << component) & component_mask) != 0;
return ((1ull << component) & component_mask) != 0;
}
} tex;
@@ -382,7 +391,7 @@ union Instruction {
ASSERT(component_mask_selector < mask.size());
return ((1 << component) & mask[component_mask_selector]) != 0;
return ((1ull << component) & mask[component_mask_selector]) != 0;
}
} texs;
@@ -475,6 +484,9 @@ public:
I2I_C,
I2I_R,
I2I_IMM,
LOP_C,
LOP_R,
LOP_IMM,
LOP32I,
MOV_C,
MOV_R,
@@ -514,10 +526,10 @@ public:
enum class Type {
Trivial,
Arithmetic,
ArithmeticImmediate,
ArithmeticInteger,
ArithmeticIntegerImmediate,
Bfe,
Logic,
Shift,
Ffma,
Flow,
@@ -644,7 +656,7 @@ private:
INST("0100110001101---", Id::FMUL_C, Type::Arithmetic, "FMUL_C"),
INST("0101110001101---", Id::FMUL_R, Type::Arithmetic, "FMUL_R"),
INST("0011100-01101---", Id::FMUL_IMM, Type::Arithmetic, "FMUL_IMM"),
INST("00011110--------", Id::FMUL32_IMM, Type::Arithmetic, "FMUL32_IMM"),
INST("00011110--------", Id::FMUL32_IMM, Type::ArithmeticImmediate, "FMUL32_IMM"),
INST("0100110000010---", Id::IADD_C, Type::ArithmeticInteger, "IADD_C"),
INST("0101110000010---", Id::IADD_R, Type::ArithmeticInteger, "IADD_R"),
INST("0011100-00010---", Id::IADD_IMM, Type::ArithmeticInteger, "IADD_IMM"),
@@ -665,7 +677,7 @@ private:
INST("0100110010011---", Id::MOV_C, Type::Arithmetic, "MOV_C"),
INST("0101110010011---", Id::MOV_R, Type::Arithmetic, "MOV_R"),
INST("0011100-10011---", Id::MOV_IMM, Type::Arithmetic, "MOV_IMM"),
INST("000000010000----", Id::MOV32_IMM, Type::Arithmetic, "MOV32_IMM"),
INST("000000010000----", Id::MOV32_IMM, Type::ArithmeticImmediate, "MOV32_IMM"),
INST("0100110001100---", Id::FMNMX_C, Type::Arithmetic, "FMNMX_C"),
INST("0101110001100---", Id::FMNMX_R, Type::Arithmetic, "FMNMX_R"),
INST("0011100-01100---", Id::FMNMX_IMM, Type::Arithmetic, "FMNMX_IMM"),
@@ -675,7 +687,10 @@ private:
INST("0100110000000---", Id::BFE_C, Type::Bfe, "BFE_C"),
INST("0101110000000---", Id::BFE_R, Type::Bfe, "BFE_R"),
INST("0011100-00000---", Id::BFE_IMM, Type::Bfe, "BFE_IMM"),
INST("000001----------", Id::LOP32I, Type::Logic, "LOP32I"),
INST("0100110001000---", Id::LOP_C, Type::ArithmeticInteger, "LOP_C"),
INST("0101110001000---", Id::LOP_R, Type::ArithmeticInteger, "LOP_R"),
INST("0011100001000---", Id::LOP_IMM, Type::ArithmeticInteger, "LOP_IMM"),
INST("000001----------", Id::LOP32I, Type::ArithmeticIntegerImmediate, "LOP32I"),
INST("0100110001001---", Id::SHL_C, Type::Shift, "SHL_C"),
INST("0101110001001---", Id::SHL_R, Type::Shift, "SHL_R"),
INST("0011100-01001---", Id::SHL_IMM, Type::Shift, "SHL_IMM"),
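
The `1 << component` to `1ull << component` change in `IsComponentEnabled` keeps the shift in 64-bit unsigned arithmetic, the same width as the `u64` bit field it is masked against. A minimal sketch of the same test, written as a hypothetical free function that takes the mask as a plain integer instead of a `BitField`:

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical free-function version of IsComponentEnabled from the diff.
constexpr bool IsComponentEnabled(std::uint64_t component_mask, std::size_t component) {
    // 1ull makes the shifted value an unsigned 64-bit integer, so no signed
    // int takes part in the bitwise AND with the mask.
    return ((1ull << component) & component_mask) != 0;
}
```

With a mask of `0x5`, components 0 and 2 are enabled and components 1 and 3 are not.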


@@ -16,6 +16,7 @@ namespace Tegra {
enum class RenderTargetFormat : u32 {
NONE = 0x0,
RGBA32_FLOAT = 0xC0,
RGBA32_UINT = 0xC2,
RGBA16_FLOAT = 0xCA,
RGB10_A2_UNORM = 0xD1,
RGBA8_UNORM = 0xD5,


@@ -51,9 +51,8 @@ public:
}
/// Attempt to use a faster method to display the framebuffer to screen
virtual bool AccelerateDisplay(const Tegra::FramebufferConfig& framebuffer,
VAddr framebuffer_addr, u32 pixel_stride,
ScreenInfo& screen_info) {
virtual bool AccelerateDisplay(const Tegra::FramebufferConfig& config, VAddr framebuffer_addr,
u32 pixel_stride, ScreenInfo& screen_info) {
return false;
}


@@ -146,7 +146,6 @@ std::pair<u8*, GLintptr> RasterizerOpenGL::SetupVertexArrays(u8* array_ptr,
u64 size = end - start + 1;
// Copy vertex array data
res_cache.FlushRegion(start, size, nullptr);
Memory::ReadBlock(*memory_manager->GpuToCpuAddress(start), array_ptr, size);
// Bind the vertex array to the buffer at the current offset.
@@ -197,8 +196,8 @@ void RasterizerOpenGL::SetupShaders(u8* buffer_ptr, GLintptr buffer_offset) {
ASSERT_MSG(!gpu.regs.shader_config[0].enable, "VertexA is unsupported!");
// Next available bindpoints to use when uploading the const buffers and textures to the GLSL
// shaders.
u32 current_constbuffer_bindpoint = 0;
// shaders. The constbuffer bindpoint starts after the shader stage configuration bind points.
u32 current_constbuffer_bindpoint = uniform_buffers.size();
u32 current_texture_bindpoint = 0;
for (unsigned index = 1; index < Maxwell::MaxShaderProgram; ++index) {
@@ -325,29 +324,22 @@ void RasterizerOpenGL::DrawArrays() {
std::tie(color_surface, depth_surface, surfaces_rect) =
res_cache.GetFramebufferSurfaces(using_color_fb, using_depth_fb, viewport_rect);
const u16 res_scale = color_surface != nullptr
? color_surface->res_scale
: (depth_surface == nullptr ? 1u : depth_surface->res_scale);
MathUtil::Rectangle<u32> draw_rect{
static_cast<u32>(std::clamp<s32>(static_cast<s32>(surfaces_rect.left) + viewport_rect.left,
surfaces_rect.left, surfaces_rect.right)), // Left
static_cast<u32>(std::clamp<s32>(static_cast<s32>(surfaces_rect.bottom) + viewport_rect.top,
surfaces_rect.bottom, surfaces_rect.top)), // Top
static_cast<u32>(std::clamp<s32>(static_cast<s32>(surfaces_rect.left) + viewport_rect.right,
surfaces_rect.left, surfaces_rect.right)), // Right
static_cast<u32>(
std::clamp<s32>(static_cast<s32>(surfaces_rect.left) + viewport_rect.left * res_scale,
surfaces_rect.left, surfaces_rect.right)), // Left
static_cast<u32>(
std::clamp<s32>(static_cast<s32>(surfaces_rect.bottom) + viewport_rect.top * res_scale,
surfaces_rect.bottom, surfaces_rect.top)), // Top
static_cast<u32>(
std::clamp<s32>(static_cast<s32>(surfaces_rect.left) + viewport_rect.right * res_scale,
surfaces_rect.left, surfaces_rect.right)), // Right
static_cast<u32>(std::clamp<s32>(static_cast<s32>(surfaces_rect.bottom) +
viewport_rect.bottom * res_scale,
surfaces_rect.bottom, surfaces_rect.top))}; // Bottom
std::clamp<s32>(static_cast<s32>(surfaces_rect.bottom) + viewport_rect.bottom,
surfaces_rect.bottom, surfaces_rect.top))}; // Bottom
// Bind the framebuffer surfaces
BindFramebufferSurfaces(color_surface, depth_surface, has_stencil);
// Sync the viewport
SyncViewport(surfaces_rect, res_scale);
SyncViewport(surfaces_rect);
// Sync the blend state registers
SyncBlendState();
@@ -420,14 +412,16 @@ void RasterizerOpenGL::DrawArrays() {
const GLenum primitive_mode{MaxwellToGL::PrimitiveTopology(regs.draw.topology)};
if (is_indexed) {
const GLint index_min{static_cast<GLint>(regs.index_array.first)};
const GLint index_max{static_cast<GLint>(regs.index_array.first + regs.index_array.count)};
glDrawRangeElementsBaseVertex(primitive_mode, index_min, index_max, regs.index_array.count,
MaxwellToGL::IndexFormat(regs.index_array.format),
reinterpret_cast<const void*>(index_buffer_offset),
-index_min);
const GLint base_vertex{static_cast<GLint>(regs.vb_element_base)};
// Adjust the index buffer offset so it points to the first desired index.
index_buffer_offset += regs.index_array.first * regs.index_array.FormatSizeInBytes();
glDrawElementsBaseVertex(primitive_mode, regs.index_array.count,
MaxwellToGL::IndexFormat(regs.index_array.format),
reinterpret_cast<const void*>(index_buffer_offset), base_vertex);
} else {
glDrawArrays(primitive_mode, 0, regs.vertex_buffer.count);
glDrawArrays(primitive_mode, regs.vertex_buffer.first, regs.vertex_buffer.count);
}
// Disable scissor test
@@ -437,24 +431,16 @@ void RasterizerOpenGL::DrawArrays() {
// Unbind textures for potential future use as framebuffer attachments
for (auto& texture_unit : state.texture_units) {
texture_unit.texture_2d = 0;
texture_unit.Unbind();
}
state.Apply();
// Mark framebuffer surfaces as dirty
MathUtil::Rectangle<u32> draw_rect_unscaled{
draw_rect.left / res_scale, draw_rect.top / res_scale, draw_rect.right / res_scale,
draw_rect.bottom / res_scale};
if (color_surface != nullptr && write_color_fb) {
auto interval = color_surface->GetSubRectInterval(draw_rect_unscaled);
res_cache.InvalidateRegion(boost::icl::first(interval), boost::icl::length(interval),
color_surface);
res_cache.MarkSurfaceAsDirty(color_surface);
}
if (depth_surface != nullptr && write_depth_fb) {
auto interval = depth_surface->GetSubRectInterval(draw_rect_unscaled);
res_cache.InvalidateRegion(boost::icl::first(interval), boost::icl::length(interval),
depth_surface);
res_cache.MarkSurfaceAsDirty(depth_surface);
}
}
@@ -462,7 +448,7 @@ void RasterizerOpenGL::NotifyMaxwellRegisterChanged(u32 method) {}
void RasterizerOpenGL::FlushAll() {
MICROPROFILE_SCOPE(OpenGL_CacheManagement);
res_cache.FlushAll();
res_cache.FlushRegion(0, Kernel::VMManager::MAX_ADDRESS);
}
void RasterizerOpenGL::FlushRegion(Tegra::GPUVAddr addr, u64 size) {
@@ -472,13 +458,13 @@ void RasterizerOpenGL::FlushRegion(Tegra::GPUVAddr addr, u64 size) {
void RasterizerOpenGL::InvalidateRegion(Tegra::GPUVAddr addr, u64 size) {
MICROPROFILE_SCOPE(OpenGL_CacheManagement);
res_cache.InvalidateRegion(addr, size, nullptr);
res_cache.InvalidateRegion(addr, size);
}
void RasterizerOpenGL::FlushAndInvalidateRegion(Tegra::GPUVAddr addr, u64 size) {
MICROPROFILE_SCOPE(OpenGL_CacheManagement);
res_cache.FlushRegion(addr, size);
res_cache.InvalidateRegion(addr, size, nullptr);
res_cache.InvalidateRegion(addr, size);
}
bool RasterizerOpenGL::AccelerateDisplayTransfer(const void* config) {
@@ -497,45 +483,28 @@ bool RasterizerOpenGL::AccelerateFill(const void* config) {
return true;
}
bool RasterizerOpenGL::AccelerateDisplay(const Tegra::FramebufferConfig& framebuffer,
bool RasterizerOpenGL::AccelerateDisplay(const Tegra::FramebufferConfig& config,
VAddr framebuffer_addr, u32 pixel_stride,
ScreenInfo& screen_info) {
if (framebuffer_addr == 0) {
return false;
if (!framebuffer_addr) {
return {};
}
MICROPROFILE_SCOPE(OpenGL_CacheManagement);
SurfaceParams src_params;
src_params.cpu_addr = framebuffer_addr;
src_params.addr = res_cache.TryFindFramebufferGpuAddress(framebuffer_addr).get_value_or(0);
src_params.width = std::min(framebuffer.width, pixel_stride);
src_params.height = framebuffer.height;
src_params.stride = pixel_stride;
src_params.is_tiled = true;
src_params.block_height = Tegra::Texture::TICEntry::DefaultBlockHeight;
src_params.pixel_format =
SurfaceParams::PixelFormatFromGPUPixelFormat(framebuffer.pixel_format);
src_params.component_type =
SurfaceParams::ComponentTypeFromGPUPixelFormat(framebuffer.pixel_format);
src_params.UpdateParams();
MathUtil::Rectangle<u32> src_rect;
Surface src_surface;
std::tie(src_surface, src_rect) =
res_cache.GetSurfaceSubRect(src_params, ScaleMatch::Ignore, true);
if (src_surface == nullptr) {
return false;
const auto& surface{res_cache.TryFindFramebufferSurface(framebuffer_addr)};
if (!surface) {
return {};
}
u32 scaled_width = src_surface->GetScaledWidth();
u32 scaled_height = src_surface->GetScaledHeight();
// Verify that the cached surface is the same size and format as the requested framebuffer
const auto& params{surface->GetSurfaceParams()};
const auto& pixel_format{SurfaceParams::PixelFormatFromGPUPixelFormat(config.pixel_format)};
ASSERT_MSG(params.width == config.width, "Framebuffer width is different");
ASSERT_MSG(params.height == config.height, "Framebuffer height is different");
ASSERT_MSG(params.pixel_format == pixel_format, "Framebuffer pixel_format is different");
screen_info.display_texcoords = MathUtil::Rectangle<float>(
(float)src_rect.bottom / (float)scaled_height, (float)src_rect.left / (float)scaled_width,
(float)src_rect.top / (float)scaled_height, (float)src_rect.right / (float)scaled_width);
screen_info.display_texture = src_surface->texture.handle;
screen_info.display_texture = surface->Texture().handle;
return true;
}
@@ -608,32 +577,44 @@ u32 RasterizerOpenGL::SetupConstBuffers(Maxwell::ShaderStage stage, GLuint progr
boost::optional<VAddr> addr = gpu.memory_manager->GpuToCpuAddress(buffer.address);
std::vector<u8> data;
size_t size = 0;
if (used_buffer.IsIndirect()) {
// Buffer is accessed indirectly, so upload the entire thing
data.resize(buffer.size * sizeof(float));
size = buffer.size * sizeof(float);
if (size > MaxConstbufferSize) {
NGLOG_ERROR(HW_GPU, "indirect constbuffer size {} exceeds maximum {}", size,
MaxConstbufferSize);
size = MaxConstbufferSize;
}
} else {
// Buffer is accessed directly, upload just what we use
data.resize(used_buffer.GetSize() * sizeof(float));
size = used_buffer.GetSize() * sizeof(float);
}
// Align the actual size so it ends up being a multiple of vec4 to meet the OpenGL std140
// UBO alignment requirements.
size = Common::AlignUp(size, sizeof(GLvec4));
ASSERT_MSG(size <= MaxConstbufferSize, "Constbuffer too big");
std::vector<u8> data(size);
Memory::ReadBlock(*addr, data.data(), data.size());
glBindBuffer(GL_SHADER_STORAGE_BUFFER, buffer_draw_state.ssbo);
glBufferData(GL_SHADER_STORAGE_BUFFER, data.size(), data.data(), GL_DYNAMIC_DRAW);
glBindBuffer(GL_SHADER_STORAGE_BUFFER, 0);
glBindBuffer(GL_UNIFORM_BUFFER, buffer_draw_state.ssbo);
glBufferData(GL_UNIFORM_BUFFER, data.size(), data.data(), GL_DYNAMIC_DRAW);
glBindBuffer(GL_UNIFORM_BUFFER, 0);
// Now configure the bindpoint of the buffer inside the shader
std::string buffer_name = used_buffer.GetName();
GLuint index =
glGetProgramResourceIndex(program, GL_SHADER_STORAGE_BLOCK, buffer_name.c_str());
GLuint index = glGetProgramResourceIndex(program, GL_UNIFORM_BLOCK, buffer_name.c_str());
if (index != -1)
glShaderStorageBlockBinding(program, index, buffer_draw_state.bindpoint);
glUniformBlockBinding(program, index, buffer_draw_state.bindpoint);
}
state.Apply();
return current_bindpoint + entries.size();
return current_bindpoint + static_cast<u32>(entries.size());
}
u32 RasterizerOpenGL::SetupTextures(Maxwell::ShaderStage stage, GLuint program, u32 current_unit,
@@ -662,7 +643,7 @@ u32 RasterizerOpenGL::SetupTextures(Maxwell::ShaderStage stage, GLuint program,
texture_samplers[current_bindpoint].SyncWithConfig(texture.tsc);
Surface surface = res_cache.GetTextureSurface(texture);
if (surface != nullptr) {
state.texture_units[current_bindpoint].texture_2d = surface->texture.handle;
state.texture_units[current_bindpoint].texture_2d = surface->Texture().handle;
state.texture_units[current_bindpoint].swizzle.r =
MaxwellToGL::SwizzleSource(texture.tic.x_source);
state.texture_units[current_bindpoint].swizzle.g =
@@ -679,7 +660,7 @@ u32 RasterizerOpenGL::SetupTextures(Maxwell::ShaderStage stage, GLuint program,
state.Apply();
return current_unit + entries.size();
return current_unit + static_cast<u32>(entries.size());
}
void RasterizerOpenGL::BindFramebufferSurfaces(const Surface& color_surface,
@@ -688,16 +669,16 @@ void RasterizerOpenGL::BindFramebufferSurfaces(const Surface& color_surface,
state.Apply();
glFramebufferTexture2D(GL_DRAW_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, GL_TEXTURE_2D,
color_surface != nullptr ? color_surface->texture.handle : 0, 0);
color_surface != nullptr ? color_surface->Texture().handle : 0, 0);
if (depth_surface != nullptr) {
if (has_stencil) {
// attach both depth and stencil
glFramebufferTexture2D(GL_DRAW_FRAMEBUFFER, GL_DEPTH_STENCIL_ATTACHMENT, GL_TEXTURE_2D,
depth_surface->texture.handle, 0);
depth_surface->Texture().handle, 0);
} else {
// attach depth
glFramebufferTexture2D(GL_DRAW_FRAMEBUFFER, GL_DEPTH_ATTACHMENT, GL_TEXTURE_2D,
depth_surface->texture.handle, 0);
depth_surface->Texture().handle, 0);
// clear stencil attachment
glFramebufferTexture2D(GL_DRAW_FRAMEBUFFER, GL_STENCIL_ATTACHMENT, GL_TEXTURE_2D, 0, 0);
}
@@ -708,14 +689,14 @@ void RasterizerOpenGL::BindFramebufferSurfaces(const Surface& color_surface,
}
}
void RasterizerOpenGL::SyncViewport(const MathUtil::Rectangle<u32>& surfaces_rect, u16 res_scale) {
void RasterizerOpenGL::SyncViewport(const MathUtil::Rectangle<u32>& surfaces_rect) {
const auto& regs = Core::System().GetInstance().GPU().Maxwell3D().regs;
const MathUtil::Rectangle<s32> viewport_rect{regs.viewport_transform[0].GetRect()};
state.viewport.x = static_cast<GLint>(surfaces_rect.left) + viewport_rect.left * res_scale;
state.viewport.y = static_cast<GLint>(surfaces_rect.bottom) + viewport_rect.bottom * res_scale;
state.viewport.width = static_cast<GLsizei>(viewport_rect.GetWidth() * res_scale);
state.viewport.height = static_cast<GLsizei>(viewport_rect.GetHeight() * res_scale);
state.viewport.x = static_cast<GLint>(surfaces_rect.left) + viewport_rect.left;
state.viewport.y = static_cast<GLint>(surfaces_rect.bottom) + viewport_rect.bottom;
state.viewport.width = static_cast<GLsizei>(viewport_rect.GetWidth());
state.viewport.height = static_cast<GLsizei>(viewport_rect.GetHeight());
}
void RasterizerOpenGL::SyncClipEnabled() {
@@ -740,7 +721,6 @@ void RasterizerOpenGL::SyncDepthOffset() {
void RasterizerOpenGL::SyncBlendState() {
const auto& regs = Core::System().GetInstance().GPU().Maxwell3D().regs;
ASSERT_MSG(regs.independent_blend_enable == 1, "Only independent blending is implemented");
// TODO(Subv): Support more than just render target 0.
state.blend.enabled = regs.blend.enable[0] != 0;
@@ -748,6 +728,7 @@ void RasterizerOpenGL::SyncBlendState() {
if (!state.blend.enabled)
return;
ASSERT_MSG(regs.independent_blend_enable == 1, "Only independent blending is implemented");
ASSERT_MSG(!regs.independent_blend[0].separate_alpha, "Unimplemented");
state.blend.rgb_equation = MaxwellToGL::BlendEquation(regs.independent_blend[0].equation_rgb);
state.blend.src_rgb_func = MaxwellToGL::BlendFunc(regs.independent_blend[0].factor_source_rgb);
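
The indexed draw path in `DrawArrays` above now advances the index buffer offset to the first requested index and passes `vb_element_base` as the base vertex. The offset arithmetic can be sketched as follows. The `IndexFormat` enum and its byte sizes are assumptions made for illustration (the diff only shows the call to `FormatSizeInBytes()`), and `FirstIndexByteOffset` is a hypothetical helper name.

```cpp
#include <cstddef>
#include <cstdint>

// Assumption: three index widths of 1, 2 and 4 bytes.
enum class IndexFormat { UnsignedByte, UnsignedShort, UnsignedInt };

constexpr std::size_t FormatSizeInBytes(IndexFormat format) {
    switch (format) {
    case IndexFormat::UnsignedByte:
        return 1;
    case IndexFormat::UnsignedShort:
        return 2;
    case IndexFormat::UnsignedInt:
        return 4;
    }
    return 0; // not reached for valid enum values
}

// Hypothetical helper: byte offset of the first index to draw, i.e. the
// buffer offset advanced by `first` indices of the given format.
constexpr std::size_t FirstIndexByteOffset(std::size_t buffer_offset, std::uint32_t first,
                                           IndexFormat format) {
    return buffer_offset + static_cast<std::size_t>(first) * FormatSizeInBytes(format);
}
```

For example, starting at byte 256 with `first = 10` and 16-bit indices, drawing begins at byte 276.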


@@ -54,6 +54,11 @@ public:
OGLShader shader;
};
/// Maximum supported size that a constbuffer can have in bytes.
static constexpr size_t MaxConstbufferSize = 0x10000;
static_assert(MaxConstbufferSize % sizeof(GLvec4) == 0,
"The maximum size of a constbuffer must be a multiple of the size of GLvec4");
private:
class SamplerInfo {
public:
@@ -104,7 +109,7 @@ private:
u32 current_unit, const std::vector<GLShader::SamplerEntry>& entries);
/// Syncs the viewport to match the guest state
void SyncViewport(const MathUtil::Rectangle<u32>& surfaces_rect, u16 res_scale);
void SyncViewport(const MathUtil::Rectangle<u32>& surfaces_rect);
/// Syncs the clip enabled status to match the guest state
void SyncClipEnabled();
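
The `MaxConstbufferSize` constant above pairs with the sizing logic in `SetupConstBuffers`: the byte size is clamped to the maximum and then rounded up to a multiple of a vec4, as std140 uniform blocks require. A minimal sketch of that arithmetic, assuming `GLvec4` is four 32-bit floats (16 bytes); `AlignUp` here is a stand-in for `Common::AlignUp`, and `ConstbufferUploadSize` is a hypothetical helper name. In the diff the clamp is applied only on the indirect path, while this sketch applies it unconditionally.

```cpp
#include <cstddef>

constexpr std::size_t MaxConstbufferSize = 0x10000; // value from the diff
constexpr std::size_t Vec4Size = 16;                 // assumption: sizeof(GLvec4)

static_assert(MaxConstbufferSize % Vec4Size == 0, "maximum must be a multiple of vec4");

// Stand-in for Common::AlignUp: round value up to the next multiple of alignment.
constexpr std::size_t AlignUp(std::size_t value, std::size_t alignment) {
    return (value + alignment - 1) / alignment * alignment;
}

// Hypothetical helper: bytes to upload for a constbuffer holding num_floats 32-bit floats.
constexpr std::size_t ConstbufferUploadSize(std::size_t num_floats) {
    std::size_t size = num_floats * 4;
    if (size > MaxConstbufferSize) {
        size = MaxConstbufferSize; // clamp oversized buffers
    }
    return AlignUp(size, Vec4Size);
}
```

Because the maximum is itself a multiple of 16, aligning after the clamp can never push the size past `MaxConstbufferSize`.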

File diff suppressed because it is too large


@@ -1,57 +1,26 @@
// Copyright 2015 Citra Emulator Project
// Copyright 2018 yuzu Emulator Project
// Licensed under GPLv2 or any later version
// Refer to the license.txt file included.
#pragma once
#include <array>
#include <map>
#include <memory>
#include <set>
#include <tuple>
#ifdef __GNUC__
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wunused-local-typedefs"
#endif
#include <vector>
#include <boost/icl/interval_map.hpp>
#include <boost/icl/interval_set.hpp>
#ifdef __GNUC__
#pragma GCC diagnostic pop
#endif
#include <boost/optional.hpp>
#include <glad/glad.h>
#include "common/assert.h"
#include "common/common_funcs.h"
#include "common/common_types.h"
#include "common/hash.h"
#include "common/math_util.h"
#include "video_core/gpu.h"
#include "video_core/memory_manager.h"
#include "video_core/engines/maxwell_3d.h"
#include "video_core/renderer_opengl/gl_resource_manager.h"
#include "video_core/textures/texture.h"
struct CachedSurface;
class CachedSurface;
using Surface = std::shared_ptr<CachedSurface>;
using SurfaceSet = std::set<Surface>;
using SurfaceRegions = boost::icl::interval_set<Tegra::GPUVAddr>;
using SurfaceMap = boost::icl::interval_map<Tegra::GPUVAddr, Surface>;
using SurfaceCache = boost::icl::interval_map<Tegra::GPUVAddr, SurfaceSet>;
using SurfaceInterval = SurfaceCache::interval_type;
static_assert(std::is_same<SurfaceRegions::interval_type, SurfaceCache::interval_type>() &&
std::is_same<SurfaceMap::interval_type, SurfaceCache::interval_type>(),
"incorrect interval types");
using SurfaceRect_Tuple = std::tuple<Surface, MathUtil::Rectangle<u32>>;
using SurfaceSurfaceRect_Tuple = std::tuple<Surface, Surface, MathUtil::Rectangle<u32>>;
using PageMap = boost::icl::interval_map<u64, int>;
enum class ScaleMatch {
Exact, // only accept same res scale
Upscale, // only allow higher scale than params
Ignore // accept every scaled res
};
struct SurfaceParams {
enum class PixelFormat {
ABGR8 = 0,
@@ -61,10 +30,12 @@ struct SurfaceParams {
R8 = 4,
RGBA16F = 5,
R11FG11FB10F = 6,
DXT1 = 7,
DXT23 = 8,
DXT45 = 9,
DXN1 = 10, // This is also known as BC4
RGBA32UI = 7,
DXT1 = 8,
DXT23 = 9,
DXT45 = 10,
DXN1 = 11, // This is also known as BC4
ASTC_2D_4X4 = 12,
Max,
Invalid = 255,
@@ -92,10 +63,10 @@ struct SurfaceParams {
/**
* Gets the compression factor for the specified PixelFormat. This applies to just the
* "compressed width" and "compressed height", not the overall compression factor of a
* compressed image. This is used for maintaining proper surface sizes for compressed texture
* formats.
* compressed image. This is used for maintaining proper surface sizes for compressed
* texture formats.
*/
static constexpr u32 GetCompresssionFactor(PixelFormat format) {
static constexpr u32 GetCompressionFactor(PixelFormat format) {
if (format == PixelFormat::Invalid)
return 0;
@@ -107,18 +78,17 @@ struct SurfaceParams {
1, // R8
1, // RGBA16F
1, // R11FG11FB10F
1, // RGBA32UI
4, // DXT1
4, // DXT23
4, // DXT45
4, // DXN1
4, // ASTC_2D_4X4
}};
ASSERT(static_cast<size_t>(format) < compression_factor_table.size());
return compression_factor_table[static_cast<size_t>(format)];
}
u32 GetCompresssionFactor() const {
return GetCompresssionFactor(pixel_format);
}
static constexpr u32 GetFormatBpp(PixelFormat format) {
if (format == PixelFormat::Invalid)
@@ -132,10 +102,12 @@ struct SurfaceParams {
8, // R8
64, // RGBA16F
32, // R11FG11FB10F
128, // RGBA32UI
64, // DXT1
128, // DXT23
128, // DXT45
64, // DXN1
32, // ASTC_2D_4X4
}};
ASSERT(static_cast<size_t>(format) < bpp_table.size());
@@ -156,16 +128,8 @@ struct SurfaceParams {
return PixelFormat::RGBA16F;
case Tegra::RenderTargetFormat::R11G11B10_FLOAT:
return PixelFormat::R11FG11FB10F;
default:
NGLOG_CRITICAL(HW_GPU, "Unimplemented format={}", static_cast<u32>(format));
UNREACHABLE();
}
}
static PixelFormat PixelFormatFromGPUPixelFormat(Tegra::FramebufferConfig::PixelFormat format) {
switch (format) {
case Tegra::FramebufferConfig::PixelFormat::ABGR8:
return PixelFormat::ABGR8;
case Tegra::RenderTargetFormat::RGBA32_UINT:
return PixelFormat::RGBA32UI;
default:
NGLOG_CRITICAL(HW_GPU, "Unimplemented format={}", static_cast<u32>(format));
UNREACHABLE();
@@ -189,6 +153,8 @@ struct SurfaceParams {
return PixelFormat::RGBA16F;
case Tegra::Texture::TextureFormat::BF10GF11RF11:
return PixelFormat::R11FG11FB10F;
case Tegra::Texture::TextureFormat::R32_G32_B32_A32:
return PixelFormat::RGBA32UI;
case Tegra::Texture::TextureFormat::DXT1:
return PixelFormat::DXT1;
case Tegra::Texture::TextureFormat::DXT23:
@@ -197,6 +163,8 @@ struct SurfaceParams {
return PixelFormat::DXT45;
case Tegra::Texture::TextureFormat::DXN1:
return PixelFormat::DXN1;
case Tegra::Texture::TextureFormat::ASTC_2D_4X4:
return PixelFormat::ASTC_2D_4X4;
default:
NGLOG_CRITICAL(HW_GPU, "Unimplemented format={}", static_cast<u32>(format));
UNREACHABLE();
@@ -220,6 +188,8 @@ struct SurfaceParams {
return Tegra::Texture::TextureFormat::R16_G16_B16_A16;
case PixelFormat::R11FG11FB10F:
return Tegra::Texture::TextureFormat::BF10GF11RF11;
case PixelFormat::RGBA32UI:
return Tegra::Texture::TextureFormat::R32_G32_B32_A32;
case PixelFormat::DXT1:
return Tegra::Texture::TextureFormat::DXT1;
case PixelFormat::DXT23:
@@ -228,6 +198,8 @@ struct SurfaceParams {
return Tegra::Texture::TextureFormat::DXT45;
case PixelFormat::DXN1:
return Tegra::Texture::TextureFormat::DXN1;
case PixelFormat::ASTC_2D_4X4:
return Tegra::Texture::TextureFormat::ASTC_2D_4X4;
default:
UNREACHABLE();
}
@@ -254,42 +226,24 @@ struct SurfaceParams {
case Tegra::RenderTargetFormat::RGBA16_FLOAT:
case Tegra::RenderTargetFormat::R11G11B10_FLOAT:
return ComponentType::Float;
case Tegra::RenderTargetFormat::RGBA32_UINT:
return ComponentType::UInt;
default:
NGLOG_CRITICAL(HW_GPU, "Unimplemented format={}", static_cast<u32>(format));
UNREACHABLE();
}
}
static ComponentType ComponentTypeFromGPUPixelFormat(
Tegra::FramebufferConfig::PixelFormat format) {
static PixelFormat PixelFormatFromGPUPixelFormat(Tegra::FramebufferConfig::PixelFormat format) {
switch (format) {
case Tegra::FramebufferConfig::PixelFormat::ABGR8:
return ComponentType::UNorm;
return PixelFormat::ABGR8;
default:
NGLOG_CRITICAL(HW_GPU, "Unimplemented format={}", static_cast<u32>(format));
UNREACHABLE();
}
}
static bool CheckFormatsBlittable(PixelFormat pixel_format_a, PixelFormat pixel_format_b) {
SurfaceType a_type = GetFormatType(pixel_format_a);
SurfaceType b_type = GetFormatType(pixel_format_b);
if (a_type == SurfaceType::ColorTexture && b_type == SurfaceType::ColorTexture) {
return true;
}
if (a_type == SurfaceType::Depth && b_type == SurfaceType::Depth) {
return true;
}
if (a_type == SurfaceType::DepthStencil && b_type == SurfaceType::DepthStencil) {
return true;
}
return false;
}
static SurfaceType GetFormatType(PixelFormat pixel_format) {
if (static_cast<size_t>(pixel_format) < MaxPixelFormat) {
return SurfaceType::ColorTexture;
@@ -301,168 +255,101 @@ struct SurfaceParams {
return SurfaceType::Invalid;
}
/// Update the params "size", "end" and "type" from the already set "addr", "width", "height"
/// and "pixel_format"
void UpdateParams() {
if (stride == 0) {
stride = width;
}
type = GetFormatType(pixel_format);
size = !is_tiled ? BytesInPixels(stride * (height - 1) + width)
: BytesInPixels(stride * 8 * (height / 8 - 1) + width * 8);
end = addr + size;
}
SurfaceInterval GetInterval() const {
return SurfaceInterval::right_open(addr, end);
}
// Returns the outer rectangle containing "interval"
SurfaceParams FromInterval(SurfaceInterval interval) const;
SurfaceInterval GetSubRectInterval(MathUtil::Rectangle<u32> unscaled_rect) const;
// Returns the region of the biggest valid rectange within interval
SurfaceInterval GetCopyableInterval(const Surface& src_surface) const;
/**
* Gets the actual width (in pixels) of the surface. This is provided because `width` is used
* for tracking the surface region in memory, which may be compressed for certain formats. In
* this scenario, `width` is actually the compressed width.
*/
u32 GetActualWidth() const {
return width * GetCompresssionFactor();
}
/**
* Gets the actual height (in pixels) of the surface. This is provided because `height` is used
* for tracking the surface region in memory, which may be compressed for certain formats. In
* this scenario, `height` is actually the compressed height.
*/
u32 GetActualHeight() const {
return height * GetCompresssionFactor();
}
u32 GetScaledWidth() const {
return width * res_scale;
}
u32 GetScaledHeight() const {
return height * res_scale;
}
MathUtil::Rectangle<u32> GetRect() const {
return {0, height, width, 0};
}
MathUtil::Rectangle<u32> GetScaledRect() const {
return {0, GetScaledHeight(), GetScaledWidth(), 0};
}
u64 PixelsInBytes(u64 size) const {
return size * CHAR_BIT / GetFormatBpp(pixel_format);
}
u64 BytesInPixels(u64 pixels) const {
return pixels * GetFormatBpp(pixel_format) / CHAR_BIT;
/// Returns the rectangle corresponding to this surface
MathUtil::Rectangle<u32> GetRect() const;
/// Returns the size of this surface in bytes, adjusted for compression
size_t SizeInBytes() const {
const u32 compression_factor{GetCompressionFactor(pixel_format)};
ASSERT(width % compression_factor == 0);
ASSERT(height % compression_factor == 0);
return (width / compression_factor) * (height / compression_factor) *
GetFormatBpp(pixel_format) / CHAR_BIT;
}
/// Returns the CPU virtual address for this surface
VAddr GetCpuAddr() const;
bool ExactMatch(const SurfaceParams& other_surface) const;
bool CanSubRect(const SurfaceParams& sub_surface) const;
bool CanExpand(const SurfaceParams& expanded_surface) const;
bool CanTexCopy(const SurfaceParams& texcopy_params) const;
/// Returns true if the specified region overlaps with this surface's region in Switch memory
bool IsOverlappingRegion(Tegra::GPUVAddr region_addr, size_t region_size) const {
return addr <= (region_addr + region_size) && region_addr <= (addr + size_in_bytes);
}
MathUtil::Rectangle<u32> GetSubRect(const SurfaceParams& sub_surface) const;
MathUtil::Rectangle<u32> GetScaledSubRect(const SurfaceParams& sub_surface) const;
/// Creates SurfaceParams from a texture configuration
static SurfaceParams CreateForTexture(const Tegra::Texture::FullTextureInfo& config);
Tegra::GPUVAddr addr = 0;
Tegra::GPUVAddr end = 0;
boost::optional<VAddr> cpu_addr;
u64 size = 0;
/// Creates SurfaceParams from a framebuffer configuration
static SurfaceParams CreateForFramebuffer(
const Tegra::Engines::Maxwell3D::Regs::RenderTargetConfig& config);
u32 width = 0;
u32 height = 0;
u32 stride = 0;
u32 block_height = 0;
u16 res_scale = 1;
bool is_tiled = false;
PixelFormat pixel_format = PixelFormat::Invalid;
SurfaceType type = SurfaceType::Invalid;
ComponentType component_type = ComponentType::Invalid;
Tegra::GPUVAddr addr;
bool is_tiled;
u32 block_height;
PixelFormat pixel_format;
ComponentType component_type;
SurfaceType type;
u32 width;
u32 height;
u32 unaligned_height;
size_t size_in_bytes;
};
struct CachedSurface : SurfaceParams {
bool CanFill(const SurfaceParams& dest_surface, SurfaceInterval fill_interval) const;
bool CanCopy(const SurfaceParams& dest_surface, SurfaceInterval copy_interval) const;
/// Hashable variation of SurfaceParams, used for a key in the surface cache
struct SurfaceKey : Common::HashableStruct<SurfaceParams> {
static SurfaceKey Create(const SurfaceParams& params) {
SurfaceKey res;
res.state = params;
return res;
}
};
bool IsRegionValid(SurfaceInterval interval) const {
return (invalid_regions.find(interval) == invalid_regions.end());
namespace std {
template <>
struct hash<SurfaceKey> {
size_t operator()(const SurfaceKey& k) const {
return k.Hash();
}
};
} // namespace std
class CachedSurface final {
public:
CachedSurface(const SurfaceParams& params);
const OGLTexture& Texture() const {
return texture;
}
bool IsSurfaceFullyInvalid() const {
return (invalid_regions & GetInterval()) == SurfaceRegions(GetInterval());
}
bool registered = false;
SurfaceRegions invalid_regions;
u64 fill_size = 0; /// Number of bytes to read from fill_data
std::array<u8, 4> fill_data;
OGLTexture texture;
static constexpr unsigned int GetGLBytesPerPixel(PixelFormat format) {
if (format == PixelFormat::Invalid)
static constexpr unsigned int GetGLBytesPerPixel(SurfaceParams::PixelFormat format) {
if (format == SurfaceParams::PixelFormat::Invalid)
return 0;
return SurfaceParams::GetFormatBpp(format) / CHAR_BIT;
}
std::unique_ptr<u8[]> gl_buffer;
size_t gl_buffer_size = 0;
const SurfaceParams& GetSurfaceParams() const {
return params;
}
// Read/Write data in Switch memory to/from gl_buffer
void LoadGLBuffer(Tegra::GPUVAddr load_start, Tegra::GPUVAddr load_end);
void FlushGLBuffer(Tegra::GPUVAddr flush_start, Tegra::GPUVAddr flush_end);
void LoadGLBuffer();
void FlushGLBuffer();
// Upload/Download data in gl_buffer in/to this surface's texture
void UploadGLTexture(const MathUtil::Rectangle<u32>& rect, GLuint read_fb_handle,
GLuint draw_fb_handle);
void DownloadGLTexture(const MathUtil::Rectangle<u32>& rect, GLuint read_fb_handle,
GLuint draw_fb_handle);
void UploadGLTexture(GLuint read_fb_handle, GLuint draw_fb_handle);
void DownloadGLTexture(GLuint read_fb_handle, GLuint draw_fb_handle);
private:
OGLTexture texture;
std::vector<u8> gl_buffer;
SurfaceParams params;
};
class RasterizerCacheOpenGL : NonCopyable {
class RasterizerCacheOpenGL final : NonCopyable {
public:
RasterizerCacheOpenGL();
~RasterizerCacheOpenGL();
/// Blit one surface's texture to another
bool BlitSurfaces(const Surface& src_surface, const MathUtil::Rectangle<u32>& src_rect,
const Surface& dst_surface, const MathUtil::Rectangle<u32>& dst_rect);
void ConvertD24S8toABGR(GLuint src_tex, const MathUtil::Rectangle<u32>& src_rect,
GLuint dst_tex, const MathUtil::Rectangle<u32>& dst_rect);
/// Copy one surface's region to another
void CopySurface(const Surface& src_surface, const Surface& dst_surface,
SurfaceInterval copy_interval);
/// Load a texture from Switch memory to OpenGL and cache it (if not already cached)
Surface GetSurface(const SurfaceParams& params, ScaleMatch match_res_scale,
bool load_if_create);
/// Tries to find a framebuffer GPU address based on the provided CPU address
boost::optional<Tegra::GPUVAddr> TryFindFramebufferGpuAddress(VAddr cpu_addr) const;
/// Attempt to find a subrect (resolution scaled) of a surface, otherwise loads a texture from
/// Switch memory to OpenGL and caches it (if not already cached)
SurfaceRect_Tuple GetSurfaceSubRect(const SurfaceParams& params, ScaleMatch match_res_scale,
bool load_if_create);
/// Get a surface based on the texture configuration
Surface GetTextureSurface(const Tegra::Texture::FullTextureInfo& config);
@@ -470,29 +357,21 @@ public:
SurfaceSurfaceRect_Tuple GetFramebufferSurfaces(bool using_color_fb, bool using_depth_fb,
const MathUtil::Rectangle<s32>& viewport);
/// Get a surface that matches the fill config
Surface GetFillSurface(const void* config);
/// Marks the specified surface as "dirty", in that it is out of sync with Switch memory
void MarkSurfaceAsDirty(const Surface& surface);
/// Get a surface that matches a "texture copy" display transfer config
SurfaceRect_Tuple GetTexCopySurface(const SurfaceParams& params);
/// Tries to find a framebuffer surface based on the provided CPU address
Surface TryFindFramebufferSurface(VAddr cpu_addr) const;
/// Write any cached resources overlapping the region back to memory (if dirty)
void FlushRegion(Tegra::GPUVAddr addr, u64 size, Surface flush_surface = nullptr);
void FlushRegion(Tegra::GPUVAddr addr, size_t size);
/// Mark region as being invalidated by region_owner (nullptr if Switch memory)
void InvalidateRegion(Tegra::GPUVAddr addr, u64 size, const Surface& region_owner);
/// Flush all cached resources tracked by this cache manager
void FlushAll();
/// Mark the specified region as being invalidated
void InvalidateRegion(Tegra::GPUVAddr addr, size_t size);
private:
void DuplicateSurface(const Surface& src_surface, const Surface& dest_surface);
/// Update surface's texture for given region when necessary
void ValidateSurface(const Surface& surface, Tegra::GPUVAddr addr, u64 size);
/// Create a new surface
Surface CreateSurface(const SurfaceParams& params);
void LoadSurface(const Surface& surface);
Surface GetSurface(const SurfaceParams& params);
/// Register surface into the cache
void RegisterSurface(const Surface& surface);
@@ -503,18 +382,9 @@ private:
/// Increase/decrease the number of surfaces in pages touching the specified region
void UpdatePagesCachedCount(Tegra::GPUVAddr addr, u64 size, int delta);
SurfaceCache surface_cache;
std::unordered_map<SurfaceKey, Surface> surface_cache;
PageMap cached_pages;
SurfaceMap dirty_regions;
SurfaceSet remove_surfaces;
OGLFramebuffer read_framebuffer;
OGLFramebuffer draw_framebuffer;
OGLVertexArray attributeless_vao;
OGLBuffer d24s8_abgr_buffer;
GLsizeiptr d24s8_abgr_buffer_size;
OGLProgram d24s8_abgr_shader;
GLint d24s8_abgr_tbo_size_u_id;
GLint d24s8_abgr_viewport_u_id;
};

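The `IsOverlappingRegion` check in the header above compares two address ranges with `<=` on both sides, so ranges that only touch at an endpoint also count as overlapping. A minimal standalone sketch of that comparison follows; the `GPUVAddr` alias and the free function `IsOverlapping` are hypothetical stand-ins for illustration, not part of the codebase.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for Tegra::GPUVAddr; the real type is defined elsewhere.
using GPUVAddr = std::uint64_t;

// Inclusive overlap test using the same comparison as IsOverlappingRegion:
// [addr, addr + size_in_bytes] against [region_addr, region_addr + region_size].
// Because both comparisons use <=, ranges that merely touch are reported as overlapping.
inline bool IsOverlapping(GPUVAddr addr, std::size_t size_in_bytes, GPUVAddr region_addr,
                          std::size_t region_size) {
    return addr <= (region_addr + region_size) && region_addr <= (addr + size_in_bytes);
}
```

With a surface at `0x1000` of size `0x100`, a region starting at `0x1100` still passes this test, while one starting at `0x1101` does not.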

@@ -38,7 +38,7 @@ public:
if (handle == 0)
return;
glDeleteTextures(1, &handle);
OpenGLState::GetCurState().ResetTexture(handle).Apply();
OpenGLState::GetCurState().UnbindTexture(handle).Apply();
handle = 0;
}

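The shader decompiler diff that follows adds `ConvertIntegerSize`, which emits GLSL of the form `((value << 24) >> 24)` for bytes and `((value << 16) >> 16)` for shorts. For a signed integer this shift pair sign-extends the low bits. The sketch below shows the same idea in C++; `SignExtendLowBits` is a hypothetical helper written for illustration, and it relies on arithmetic right shift of negative values, which is implementation-defined before C++20 but is what mainstream compilers do.

```cpp
#include <cstdint>

// Hypothetical helper: sign-extends the low `bits` bits of `value` (1 <= bits <= 32),
// mirroring the "(x << n) >> n" pattern emitted for signed GLSL integers.
inline std::int32_t SignExtendLowBits(std::int32_t value, unsigned bits) {
    const unsigned shift = 32u - bits;
    // Shift left on the unsigned representation to avoid signed overflow.
    const auto shifted = static_cast<std::int32_t>(static_cast<std::uint32_t>(value) << shift);
    // Shifting a signed value right copies the sign bit into the upper bits.
    return shifted >> shift;
}
```

For example, the byte `0xFF` becomes `-1` and `0x7F` stays `127`; any bits above the selected width are discarded.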

@@ -9,6 +9,7 @@
#include "common/assert.h"
#include "common/common_types.h"
#include "video_core/engines/shader_bytecode.h"
#include "video_core/renderer_opengl/gl_rasterizer.h"
#include "video_core/renderer_opengl/gl_shader_decompiler.h"
namespace GLShader {
@@ -16,6 +17,7 @@ namespace Decompiler {
using Tegra::Shader::Attribute;
using Tegra::Shader::Instruction;
using Tegra::Shader::LogicOperation;
using Tegra::Shader::OpCode;
using Tegra::Shader::Register;
using Tegra::Shader::Sampler;
@@ -265,6 +267,27 @@ public:
BuildRegisterList();
}
/**
* Returns code that does an integer size conversion for the specified size.
* @param value Value to perform integer size conversion on.
* @param size Register size to use for conversion instructions.
* @returns GLSL string corresponding to the value converted to the specified size.
*/
static std::string ConvertIntegerSize(const std::string& value, Register::Size size) {
switch (size) {
case Register::Size::Byte:
return "((" + value + " << 24) >> 24)";
case Register::Size::Short:
return "((" + value + " << 16) >> 16)";
case Register::Size::Word:
// Default - do nothing
return value;
default:
NGLOG_CRITICAL(HW_GPU, "Unimplemented conversion size {}", static_cast<u32>(size));
UNREACHABLE();
}
}
/**
* Gets a register as a float.
* @param reg The register to get.
@@ -281,15 +304,18 @@ public:
* @param reg The register to get.
* @param elem The element to use for the operation.
* @param is_signed Whether to get the register as a signed (or unsigned) integer.
* @param size Register size to use for conversion instructions.
* @returns GLSL string corresponding to the register as an integer.
*/
std::string GetRegisterAsInteger(const Register& reg, unsigned elem = 0,
bool is_signed = true) {
std::string GetRegisterAsInteger(const Register& reg, unsigned elem = 0, bool is_signed = true,
Register::Size size = Register::Size::Word) {
const std::string func = GetGLSLConversionFunc(
GLSLRegister::Type::Float,
is_signed ? GLSLRegister::Type::Integer : GLSLRegister::Type::UnsignedInteger);
return func + '(' + GetRegister(reg, elem) + ')';
std::string value = func + '(' + GetRegister(reg, elem) + ')';
return ConvertIntegerSize(value, size);
}
/**
@@ -319,19 +345,20 @@ public:
* @param value_num_components Number of components in the value.
* @param is_saturated Optional, when True, saturates the provided value.
* @param dest_elem Optional, the destination element to use for the operation.
* @param size Register size to use for conversion instructions.
*/
void SetRegisterToInteger(const Register& reg, bool is_signed, u64 elem,
const std::string& value, u64 dest_num_components,
u64 value_num_components, bool is_saturated = false,
u64 dest_elem = 0) {
u64 dest_elem = 0, Register::Size size = Register::Size::Word) {
ASSERT_MSG(!is_saturated, "Unimplemented");
const std::string func = GetGLSLConversionFunc(
is_signed ? GLSLRegister::Type::Integer : GLSLRegister::Type::UnsignedInteger,
GLSLRegister::Type::Float);
SetRegister(reg, elem, func + '(' + value + ')', dest_num_components, value_num_components,
dest_elem);
SetRegister(reg, elem, func + '(' + ConvertIntegerSize(value, size) + ')',
dest_num_components, value_num_components, dest_elem);
}
/**
@@ -371,7 +398,8 @@ public:
/// Generates code representing a uniform (C buffer) register, interpreted as the input type.
std::string GetUniform(u64 index, u64 offset, GLSLRegister::Type type) {
declr_const_buffers[index].MarkAsUsed(index, offset, stage);
std::string value = 'c' + std::to_string(index) + '[' + std::to_string(offset) + ']';
std::string value = 'c' + std::to_string(index) + '[' + std::to_string(offset / 4) + "][" +
std::to_string(offset % 4) + ']';
if (type == GLSLRegister::Type::Float) {
return value;
@@ -385,8 +413,12 @@ public:
std::string GetUniformIndirect(u64 index, s64 offset, const Register& index_reg,
GLSLRegister::Type type) {
declr_const_buffers[index].MarkAsUsedIndirect(index, stage);
std::string value = 'c' + std::to_string(index) + "[(floatBitsToInt(" +
GetRegister(index_reg, 0) + ") + " + std::to_string(offset) + ") / 4]";
std::string final_offset = "((floatBitsToInt(" + GetRegister(index_reg, 0) + ") + " +
std::to_string(offset) + ") / 4)";
std::string value =
'c' + std::to_string(index) + '[' + final_offset + " / 4][" + final_offset + " % 4]";
if (type == GLSLRegister::Type::Float) {
return value;
@@ -428,9 +460,10 @@ public:
unsigned const_buffer_layout = 0;
for (const auto& entry : GetConstBuffersDeclarations()) {
declarations.AddLine("layout(std430) buffer " + entry.GetName());
declarations.AddLine("layout(std140) uniform " + entry.GetName());
declarations.AddLine('{');
declarations.AddLine(" float c" + std::to_string(entry.GetIndex()) + "[];");
declarations.AddLine(" vec4 c" + std::to_string(entry.GetIndex()) +
"[MAX_CONSTBUFFER_ELEMENTS];");
declarations.AddLine("};");
declarations.AddNewLine();
++const_buffer_layout;
@@ -509,7 +542,7 @@ private:
*/
void SetRegister(const Register& reg, u64 elem, const std::string& value,
u64 dest_num_components, u64 value_num_components, u64 dest_elem) {
std::string dest = GetRegister(reg, dest_elem);
std::string dest = GetRegister(reg, static_cast<u32>(dest_elem));
if (dest_num_components > 1) {
dest += GetSwizzle(elem);
}
@@ -686,21 +719,31 @@ private:
/**
* Returns the comparison string to use to compare two values in the 'set' family of
* instructions.
* @params condition The condition used in the 'set'-family instruction.
* @param condition The condition used in the 'set'-family instruction.
* @param op_a First operand to use for the comparison.
* @param op_b Second operand to use for the comparison.
* @returns String corresponding to the GLSL operator that matches the desired comparison.
*/
std::string GetPredicateComparison(Tegra::Shader::PredCondition condition) const {
std::string GetPredicateComparison(Tegra::Shader::PredCondition condition,
const std::string& op_a, const std::string& op_b) const {
using Tegra::Shader::PredCondition;
static const std::unordered_map<PredCondition, const char*> PredicateComparisonStrings = {
{PredCondition::LessThan, "<"}, {PredCondition::Equal, "=="},
{PredCondition::LessEqual, "<="}, {PredCondition::GreaterThan, ">"},
{PredCondition::NotEqual, "!="}, {PredCondition::GreaterEqual, ">="},
{PredCondition::LessThan, "<"}, {PredCondition::Equal, "=="},
{PredCondition::LessEqual, "<="}, {PredCondition::GreaterThan, ">"},
{PredCondition::NotEqual, "!="}, {PredCondition::GreaterEqual, ">="},
{PredCondition::NotEqualWithNan, "!="},
};
auto comparison = PredicateComparisonStrings.find(condition);
const auto& comparison{PredicateComparisonStrings.find(condition)};
ASSERT_MSG(comparison != PredicateComparisonStrings.end(),
"Unknown predicate comparison operation");
return comparison->second;
std::string predicate{'(' + op_a + ") " + comparison->second + " (" + op_b + ')'};
if (condition == PredCondition::NotEqualWithNan) {
predicate += " || isnan(" + op_a + ") || isnan(" + op_b + ')';
}
return predicate;
}
/**
@@ -734,6 +777,31 @@ private:
return (absolute_offset % SchedPeriod) == 0;
}
void WriteLogicOperation(Register dest, LogicOperation logic_op, const std::string& op_a,
const std::string& op_b) {
switch (logic_op) {
case LogicOperation::And: {
regs.SetRegisterToInteger(dest, true, 0, '(' + op_a + " & " + op_b + ')', 1, 1);
break;
}
case LogicOperation::Or: {
regs.SetRegisterToInteger(dest, true, 0, '(' + op_a + " | " + op_b + ')', 1, 1);
break;
}
case LogicOperation::Xor: {
regs.SetRegisterToInteger(dest, true, 0, '(' + op_a + " ^ " + op_b + ')', 1, 1);
break;
}
case LogicOperation::PassB: {
regs.SetRegisterToInteger(dest, true, 0, op_b, 1, 1);
break;
}
default:
NGLOG_CRITICAL(HW_GPU, "Unimplemented logic operation: {}", static_cast<u32>(logic_op));
UNREACHABLE();
}
}
/**
* Compiles a single instruction from Tegra to GLSL.
* @param offset the offset of the Tegra shader instruction.
@@ -753,6 +821,7 @@ private:
if (!opcode) {
NGLOG_CRITICAL(HW_GPU, "Unhandled instruction: {0:x}", instr.value);
UNREACHABLE();
return offset + 1;
}
shader.AddLine("// " + std::to_string(offset) + ": " + opcode->GetName());
@@ -771,22 +840,25 @@ private:
switch (opcode->GetType()) {
case OpCode::Type::Arithmetic: {
std::string op_a = instr.alu.negate_a ? "-" : "";
op_a += regs.GetRegisterAsFloat(instr.gpr8);
std::string op_a = regs.GetRegisterAsFloat(instr.gpr8);
if (instr.alu.abs_a) {
op_a = "abs(" + op_a + ')';
}
std::string op_b = instr.alu.negate_b ? "-" : "";
if (instr.alu.negate_a) {
op_a = "-(" + op_a + ')';
}
std::string op_b;
if (instr.is_b_imm) {
op_b += GetImmediate19(instr);
op_b = GetImmediate19(instr);
} else {
if (instr.is_b_gpr) {
op_b += regs.GetRegisterAsFloat(instr.gpr20);
op_b = regs.GetRegisterAsFloat(instr.gpr20);
} else {
op_b += regs.GetUniform(instr.cbuf34.index, instr.cbuf34.offset,
GLSLRegister::Type::Float);
op_b = regs.GetUniform(instr.cbuf34.index, instr.cbuf34.offset,
GLSLRegister::Type::Float);
}
}
@@ -794,6 +866,10 @@ private:
op_b = "abs(" + op_b + ')';
}
if (instr.alu.negate_b) {
op_b = "-(" + op_b + ')';
}
switch (opcode->GetId()) {
case OpCode::Id::MOV_C:
case OpCode::Id::MOV_R: {
@@ -801,11 +877,6 @@ private:
break;
}
case OpCode::Id::MOV32_IMM: {
// mov32i doesn't have abs or neg bits.
regs.SetRegisterToFloat(instr.gpr0, 0, GetImmediate32(instr), 1, 1);
break;
}
case OpCode::Id::FMUL_C:
case OpCode::Id::FMUL_R:
case OpCode::Id::FMUL_IMM: {
@@ -813,13 +884,6 @@ private:
instr.alu.saturate_d);
break;
}
case OpCode::Id::FMUL32_IMM: {
// fmul32i doesn't have abs or neg bits.
regs.SetRegisterToFloat(
instr.gpr0, 0,
regs.GetRegisterAsFloat(instr.gpr8) + " * " + GetImmediate32(instr), 1, 1);
break;
}
case OpCode::Id::FADD_C:
case OpCode::Id::FADD_R:
case OpCode::Id::FADD_IMM: {
@@ -853,10 +917,6 @@ private:
regs.SetRegisterToFloat(instr.gpr0, 0, "inversesqrt(" + op_a + ')', 1, 1,
instr.alu.saturate_d);
break;
case SubOp::Min:
regs.SetRegisterToFloat(instr.gpr0, 0, "min(" + op_a + "," + op_b + ')', 1, 1,
instr.alu.saturate_d);
break;
default:
NGLOG_CRITICAL(HW_GPU, "Unhandled MUFU sub op: {0:x}",
static_cast<unsigned>(instr.sub_op.Value()));
@@ -892,6 +952,21 @@ private:
}
break;
}
case OpCode::Type::ArithmeticImmediate: {
switch (opcode->GetId()) {
case OpCode::Id::MOV32_IMM: {
regs.SetRegisterToFloat(instr.gpr0, 0, GetImmediate32(instr), 1, 1);
break;
}
case OpCode::Id::FMUL32_IMM: {
regs.SetRegisterToFloat(
instr.gpr0, 0,
regs.GetRegisterAsFloat(instr.gpr8) + " * " + GetImmediate32(instr), 1, 1);
break;
}
}
break;
}
case OpCode::Type::Bfe: {
ASSERT_MSG(!instr.bfe.negate_b, "Unimplemented");
@@ -917,49 +992,6 @@ private:
break;
}
case OpCode::Type::Logic: {
std::string op_a = regs.GetRegisterAsInteger(instr.gpr8, 0, true);
if (instr.alu.lop.invert_a)
op_a = "~(" + op_a + ')';
switch (opcode->GetId()) {
case OpCode::Id::LOP32I: {
u32 imm = static_cast<u32>(instr.alu.imm20_32.Value());
if (instr.alu.lop.invert_b)
imm = ~imm;
switch (instr.alu.lop.operation) {
case Tegra::Shader::LogicOperation::And: {
regs.SetRegisterToInteger(instr.gpr0, true, 0,
'(' + op_a + " & " + std::to_string(imm) + ')', 1, 1);
break;
}
case Tegra::Shader::LogicOperation::Or: {
regs.SetRegisterToInteger(instr.gpr0, true, 0,
'(' + op_a + " | " + std::to_string(imm) + ')', 1, 1);
break;
}
case Tegra::Shader::LogicOperation::Xor: {
regs.SetRegisterToInteger(instr.gpr0, true, 0,
'(' + op_a + " ^ " + std::to_string(imm) + ')', 1, 1);
break;
}
default:
NGLOG_CRITICAL(HW_GPU, "Unimplemented lop32i operation: {}",
static_cast<u32>(instr.alu.lop.operation.Value()));
UNREACHABLE();
}
break;
}
default: {
NGLOG_CRITICAL(HW_GPU, "Unhandled logic instruction: {}", opcode->GetName());
UNREACHABLE();
}
}
break;
}
case OpCode::Type::Shift: {
std::string op_a = regs.GetRegisterAsInteger(instr.gpr8, 0, true);
@@ -1005,17 +1037,26 @@ private:
case OpCode::Type::ArithmeticIntegerImmediate: {
std::string op_a = regs.GetRegisterAsInteger(instr.gpr8);
if (instr.iadd32i.negate_a)
op_a = '-' + op_a;
std::string op_b = '(' + std::to_string(instr.alu.imm20_32.Value()) + ')';
std::string op_b = std::to_string(instr.alu.imm20_32.Value());
switch (opcode->GetId()) {
case OpCode::Id::IADD32I:
if (instr.iadd32i.negate_a)
op_a = "-(" + op_a + ')';
regs.SetRegisterToInteger(instr.gpr0, true, 0, op_a + " + " + op_b, 1, 1,
instr.iadd32i.saturate != 0);
break;
case OpCode::Id::LOP32I: {
if (instr.alu.lop32i.invert_a)
op_a = "~(" + op_a + ')';
if (instr.alu.lop32i.invert_b)
op_b = "~(" + op_b + ')';
WriteLogicOperation(instr.gpr0, instr.alu.lop32i.operation, op_a, op_b);
break;
}
default: {
NGLOG_CRITICAL(HW_GPU, "Unhandled ArithmeticIntegerImmediate instruction: {}",
opcode->GetName());
@@ -1026,12 +1067,7 @@ private:
}
case OpCode::Type::ArithmeticInteger: {
std::string op_a = regs.GetRegisterAsInteger(instr.gpr8);
if (instr.alu_integer.negate_a)
op_a = '-' + op_a;
std::string op_b = instr.alu_integer.negate_b ? "-" : "";
std::string op_b;
if (instr.is_b_imm) {
op_b += '(' + std::to_string(instr.alu.GetSignedImm20_20()) + ')';
} else {
@@ -1047,6 +1083,12 @@ private:
case OpCode::Id::IADD_C:
case OpCode::Id::IADD_R:
case OpCode::Id::IADD_IMM: {
if (instr.alu_integer.negate_a)
op_a = "-(" + op_a + ')';
if (instr.alu_integer.negate_b)
op_b = "-(" + op_b + ')';
regs.SetRegisterToInteger(instr.gpr0, true, 0, op_a + " + " + op_b, 1, 1,
instr.alu.saturate_d);
break;
@@ -1054,12 +1096,33 @@ private:
case OpCode::Id::ISCADD_C:
case OpCode::Id::ISCADD_R:
case OpCode::Id::ISCADD_IMM: {
if (instr.alu_integer.negate_a)
op_a = "-(" + op_a + ')';
if (instr.alu_integer.negate_b)
op_b = "-(" + op_b + ')';
std::string shift = std::to_string(instr.alu_integer.shift_amount.Value());
regs.SetRegisterToInteger(instr.gpr0, true, 0,
"((" + op_a + " << " + shift + ") + " + op_b + ')', 1, 1);
break;
}
case OpCode::Id::LOP_C:
case OpCode::Id::LOP_R:
case OpCode::Id::LOP_IMM: {
ASSERT_MSG(!instr.alu.lop.unk44, "Unimplemented");
ASSERT_MSG(instr.alu.lop.pred48 == Pred::UnusedIndex, "Unimplemented");
if (instr.alu.lop.invert_a)
op_a = "~(" + op_a + ')';
if (instr.alu.lop.invert_b)
op_b = "~(" + op_b + ')';
WriteLogicOperation(instr.gpr0, instr.alu.lop.operation, op_a, op_b);
break;
}
default: {
NGLOG_CRITICAL(HW_GPU, "Unhandled ArithmeticInteger instruction: {}",
opcode->GetName());
@@ -1108,28 +1171,28 @@ private:
break;
}
case OpCode::Type::Conversion: {
ASSERT_MSG(instr.conversion.size == Register::Size::Word, "Unimplemented");
ASSERT_MSG(!instr.conversion.negate_a, "Unimplemented");
switch (opcode->GetId()) {
case OpCode::Id::I2I_R: {
ASSERT_MSG(!instr.conversion.selector, "Unimplemented");
std::string op_a =
regs.GetRegisterAsInteger(instr.gpr20, 0, instr.conversion.is_input_signed);
std::string op_a = regs.GetRegisterAsInteger(
instr.gpr20, 0, instr.conversion.is_input_signed, instr.conversion.src_size);
if (instr.conversion.abs_a) {
op_a = "abs(" + op_a + ')';
}
regs.SetRegisterToInteger(instr.gpr0, instr.conversion.is_output_signed, 0, op_a, 1,
1, instr.alu.saturate_d);
1, instr.alu.saturate_d, 0, instr.conversion.dest_size);
break;
}
case OpCode::Id::I2F_R: {
ASSERT_MSG(instr.conversion.dest_size == Register::Size::Word, "Unimplemented");
ASSERT_MSG(!instr.conversion.selector, "Unimplemented");
std::string op_a =
regs.GetRegisterAsInteger(instr.gpr20, 0, instr.conversion.is_input_signed);
std::string op_a = regs.GetRegisterAsInteger(
instr.gpr20, 0, instr.conversion.is_input_signed, instr.conversion.src_size);
if (instr.conversion.abs_a) {
op_a = "abs(" + op_a + ')';
@@ -1139,6 +1202,8 @@ private:
break;
}
case OpCode::Id::F2F_R: {
ASSERT_MSG(instr.conversion.dest_size == Register::Size::Word, "Unimplemented");
ASSERT_MSG(instr.conversion.src_size == Register::Size::Word, "Unimplemented");
std::string op_a = regs.GetRegisterAsFloat(instr.gpr20);
switch (instr.conversion.f2f.rounding) {
@@ -1168,6 +1233,7 @@ private:
break;
}
case OpCode::Id::F2I_R: {
ASSERT_MSG(instr.conversion.src_size == Register::Size::Word, "Unimplemented");
std::string op_a = regs.GetRegisterAsFloat(instr.gpr20);
if (instr.conversion.abs_a) {
@@ -1200,7 +1266,7 @@ private:
}
regs.SetRegisterToInteger(instr.gpr0, instr.conversion.is_output_signed, 0, op_a, 1,
1);
1, false, 0, instr.conversion.dest_size);
break;
}
default: {
@@ -1355,10 +1421,9 @@ private:
std::string second_pred =
GetPredicateCondition(instr.fsetp.pred39, instr.fsetp.neg_pred != 0);
std::string comparator = GetPredicateComparison(instr.fsetp.cond);
std::string combiner = GetPredicateCombiner(instr.fsetp.op);
std::string predicate = '(' + op_a + ") " + comparator + " (" + op_b + ')';
std::string predicate = GetPredicateComparison(instr.fsetp.cond, op_a, op_b);
// Set the primary predicate to the result of Predicate OP SecondPredicate
SetPredicate(instr.fsetp.pred3,
'(' + predicate + ") " + combiner + " (" + second_pred + ')');
@@ -1393,10 +1458,9 @@ private:
std::string second_pred =
GetPredicateCondition(instr.isetp.pred39, instr.isetp.neg_pred != 0);
std::string comparator = GetPredicateComparison(instr.isetp.cond);
std::string combiner = GetPredicateCombiner(instr.isetp.op);
std::string predicate = '(' + op_a + ") " + comparator + " (" + op_b + ')';
std::string predicate = GetPredicateComparison(instr.isetp.cond, op_a, op_b);
// Set the primary predicate to the result of Predicate OP SecondPredicate
SetPredicate(instr.isetp.pred3,
'(' + predicate + ") " + combiner + " (" + second_pred + ')');
@@ -1443,11 +1507,10 @@ private:
std::string second_pred =
GetPredicateCondition(instr.fset.pred39, instr.fset.neg_pred != 0);
std::string comparator = GetPredicateComparison(instr.fset.cond);
std::string combiner = GetPredicateCombiner(instr.fset.op);
std::string predicate = "(((" + op_a + ") " + comparator + " (" + op_b + ")) " +
combiner + " (" + second_pred + "))";
std::string predicate = "((" + GetPredicateComparison(instr.fset.cond, op_a, op_b) +
") " + combiner + " (" + second_pred + "))";
if (instr.fset.bf) {
regs.SetRegisterToFloat(instr.gpr0, 0, predicate + " ? 1.0 : 0.0", 1, 1);
@@ -1478,11 +1541,10 @@ private:
std::string second_pred =
GetPredicateCondition(instr.iset.pred39, instr.iset.neg_pred != 0);
std::string comparator = GetPredicateComparison(instr.iset.cond);
std::string combiner = GetPredicateCombiner(instr.iset.op);
std::string predicate = "(((" + op_a + ") " + comparator + " (" + op_b + ")) " +
combiner + " (" + second_pred + "))";
std::string predicate = "((" + GetPredicateComparison(instr.iset.cond, op_a, op_b) +
") " + combiner + " (" + second_pred + "))";
if (instr.iset.bf) {
regs.SetRegisterToFloat(instr.gpr0, 0, predicate + " ? 1.0 : 0.0", 1, 1);
@@ -1661,7 +1723,10 @@ private:
}; // namespace Decompiler
std::string GetCommonDeclarations() {
return "bool exec_shader();";
std::string declarations = "bool exec_shader();\n";
declarations += "#define MAX_CONSTBUFFER_ELEMENTS " +
std::to_string(RasterizerOpenGL::MaxConstbufferSize / (sizeof(GLvec4)));
return declarations;
}
boost::optional<ProgramResult> DecompileProgram(const ProgramCode& program_code, u32 main_offset,

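The decompiler change above moves constant buffers from an unsized `float` array to a `vec4` array in a `std140` uniform block, so a float element offset is split into a vector index (`offset / 4`) and a component index (`offset % 4`). The sketch below reproduces that mapping in isolation; `Vec4Slot`, `ToVec4Slot` and `UniformExpression` are hypothetical names introduced for illustration, and the offset is assumed to count 32-bit elements, as in the direct `GetUniform` path.

```cpp
#include <cstdint>
#include <string>

// Hypothetical pair describing where a float element lives inside a vec4 array.
struct Vec4Slot {
    std::uint64_t vector;    // index into the vec4 array
    std::uint64_t component; // 0..3, i.e. x, y, z or w
};

// Splits a float element offset into vec4 index and component.
inline Vec4Slot ToVec4Slot(std::uint64_t element_offset) {
    return {element_offset / 4, element_offset % 4};
}

// Builds a GLSL expression such as "c1[1][2]" for buffer 1, element 6.
inline std::string UniformExpression(std::uint64_t buffer_index, std::uint64_t element_offset) {
    const Vec4Slot slot = ToVec4Slot(element_offset);
    return 'c' + std::to_string(buffer_index) + '[' + std::to_string(slot.vector) + "][" +
           std::to_string(slot.component) + ']';
}
```

Declaring the array as `vec4` avoids the `std140` rule that would otherwise pad every scalar array element to 16 bytes.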

@@ -39,6 +39,10 @@ void main() {
// Viewport can be flipped, which is unsupported by glViewport
position.xy *= viewport_flip.xy;
gl_Position = position;
// TODO(bunnei): This is likely a hack, position.w should be interpolated as 1.0
// For now, this is here to bring order in lieu of proper emulation
position.w = 1.0;
}
)";
out += program.first;


@@ -38,8 +38,8 @@ void MaxwellUniformData::SetFromRegs(const Maxwell3D::State::ShaderStageInfo& sh
const auto& regs = Core::System().GetInstance().GPU().Maxwell3D().regs;
// TODO(bunnei): Support more than one viewport
viewport_flip[0] = regs.viewport_transform[0].scale_x < 0.0 ? -1.0 : 1.0;
viewport_flip[1] = regs.viewport_transform[0].scale_y < 0.0 ? -1.0 : 1.0;
viewport_flip[0] = regs.viewport_transform[0].scale_x < 0.0 ? -1.0f : 1.0f;
viewport_flip[1] = regs.viewport_transform[0].scale_y < 0.0 ? -1.0f : 1.0f;
}
} // namespace GLShader


@@ -48,24 +48,9 @@ OpenGLState::OpenGLState() {
logic_op = GL_COPY;
for (auto& texture_unit : texture_units) {
texture_unit.texture_2d = 0;
texture_unit.sampler = 0;
texture_unit.swizzle.r = GL_RED;
texture_unit.swizzle.g = GL_GREEN;
texture_unit.swizzle.b = GL_BLUE;
texture_unit.swizzle.a = GL_ALPHA;
texture_unit.Reset();
}
lighting_lut.texture_buffer = 0;
fog_lut.texture_buffer = 0;
proctex_lut.texture_buffer = 0;
proctex_diff_lut.texture_buffer = 0;
proctex_color_map.texture_buffer = 0;
proctex_alpha_map.texture_buffer = 0;
proctex_noise_lut.texture_buffer = 0;
draw.read_framebuffer = 0;
draw.draw_framebuffer = 0;
draw.vertex_array = 0;
@@ -196,13 +181,13 @@ void OpenGLState::Apply() const {
}
// Textures
for (size_t i = 0; i < std::size(texture_units); ++i) {
for (int i = 0; i < std::size(texture_units); ++i) {
if (texture_units[i].texture_2d != cur_state.texture_units[i].texture_2d) {
glActiveTexture(TextureUnits::MaxwellTexture(i).Enum());
glBindTexture(GL_TEXTURE_2D, texture_units[i].texture_2d);
}
if (texture_units[i].sampler != cur_state.texture_units[i].sampler) {
glBindSampler(i, texture_units[i].sampler);
glBindSampler(static_cast<GLuint>(i), texture_units[i].sampler);
}
// Update the texture swizzle
if (texture_units[i].swizzle.r != cur_state.texture_units[i].swizzle.r ||
@@ -223,54 +208,12 @@ void OpenGLState::Apply() const {
if (current.enabled != new_state.enabled || current.bindpoint != new_state.bindpoint ||
current.ssbo != new_state.ssbo) {
if (new_state.enabled) {
glBindBufferBase(GL_SHADER_STORAGE_BUFFER, new_state.bindpoint, new_state.ssbo);
glBindBufferBase(GL_UNIFORM_BUFFER, new_state.bindpoint, new_state.ssbo);
}
}
}
}
// Lighting LUTs
if (lighting_lut.texture_buffer != cur_state.lighting_lut.texture_buffer) {
glActiveTexture(TextureUnits::LightingLUT.Enum());
glBindTexture(GL_TEXTURE_BUFFER, lighting_lut.texture_buffer);
}
// Fog LUT
if (fog_lut.texture_buffer != cur_state.fog_lut.texture_buffer) {
glActiveTexture(TextureUnits::FogLUT.Enum());
glBindTexture(GL_TEXTURE_BUFFER, fog_lut.texture_buffer);
}
// ProcTex Noise LUT
if (proctex_noise_lut.texture_buffer != cur_state.proctex_noise_lut.texture_buffer) {
glActiveTexture(TextureUnits::ProcTexNoiseLUT.Enum());
glBindTexture(GL_TEXTURE_BUFFER, proctex_noise_lut.texture_buffer);
}
// ProcTex Color Map
if (proctex_color_map.texture_buffer != cur_state.proctex_color_map.texture_buffer) {
glActiveTexture(TextureUnits::ProcTexColorMap.Enum());
glBindTexture(GL_TEXTURE_BUFFER, proctex_color_map.texture_buffer);
}
// ProcTex Alpha Map
if (proctex_alpha_map.texture_buffer != cur_state.proctex_alpha_map.texture_buffer) {
glActiveTexture(TextureUnits::ProcTexAlphaMap.Enum());
glBindTexture(GL_TEXTURE_BUFFER, proctex_alpha_map.texture_buffer);
}
// ProcTex LUT
if (proctex_lut.texture_buffer != cur_state.proctex_lut.texture_buffer) {
glActiveTexture(TextureUnits::ProcTexLUT.Enum());
glBindTexture(GL_TEXTURE_BUFFER, proctex_lut.texture_buffer);
}
// ProcTex Diff LUT
if (proctex_diff_lut.texture_buffer != cur_state.proctex_diff_lut.texture_buffer) {
glActiveTexture(TextureUnits::ProcTexDiffLUT.Enum());
glBindTexture(GL_TEXTURE_BUFFER, proctex_diff_lut.texture_buffer);
}
// Framebuffer
if (draw.read_framebuffer != cur_state.draw.read_framebuffer) {
glBindFramebuffer(GL_READ_FRAMEBUFFER, draw.read_framebuffer);
@@ -338,26 +281,12 @@ void OpenGLState::Apply() const {
cur_state = *this;
}
OpenGLState& OpenGLState::ResetTexture(GLuint handle) {
OpenGLState& OpenGLState::UnbindTexture(GLuint handle) {
for (auto& unit : texture_units) {
if (unit.texture_2d == handle) {
unit.texture_2d = 0;
unit.Unbind();
}
}
if (lighting_lut.texture_buffer == handle)
lighting_lut.texture_buffer = 0;
if (fog_lut.texture_buffer == handle)
fog_lut.texture_buffer = 0;
if (proctex_noise_lut.texture_buffer == handle)
proctex_noise_lut.texture_buffer = 0;
if (proctex_color_map.texture_buffer == handle)
proctex_color_map.texture_buffer = 0;
if (proctex_alpha_map.texture_buffer == handle)
proctex_alpha_map.texture_buffer = 0;
if (proctex_lut.texture_buffer == handle)
proctex_lut.texture_buffer = 0;
if (proctex_diff_lut.texture_buffer == handle)
proctex_diff_lut.texture_buffer = 0;
return *this;
}


@@ -91,36 +91,21 @@ public:
GLint b; // GL_TEXTURE_SWIZZLE_B
GLint a; // GL_TEXTURE_SWIZZLE_A
} swizzle;
void Unbind() {
texture_2d = 0;
swizzle.r = GL_RED;
swizzle.g = GL_GREEN;
swizzle.b = GL_BLUE;
swizzle.a = GL_ALPHA;
}
void Reset() {
Unbind();
sampler = 0;
}
} texture_units[32];
struct {
GLuint texture_buffer; // GL_TEXTURE_BINDING_BUFFER
} lighting_lut;
struct {
GLuint texture_buffer; // GL_TEXTURE_BINDING_BUFFER
} fog_lut;
struct {
GLuint texture_buffer; // GL_TEXTURE_BINDING_BUFFER
} proctex_noise_lut;
struct {
GLuint texture_buffer; // GL_TEXTURE_BINDING_BUFFER
} proctex_color_map;
struct {
GLuint texture_buffer; // GL_TEXTURE_BINDING_BUFFER
} proctex_alpha_map;
struct {
GLuint texture_buffer; // GL_TEXTURE_BINDING_BUFFER
} proctex_lut;
struct {
GLuint texture_buffer; // GL_TEXTURE_BINDING_BUFFER
} proctex_diff_lut;
struct {
GLuint read_framebuffer; // GL_READ_FRAMEBUFFER_BINDING
GLuint draw_framebuffer; // GL_DRAW_FRAMEBUFFER_BINDING
@@ -165,7 +150,7 @@ public:
void Apply() const;
/// Resets any references to the given resource
-OpenGLState& ResetTexture(GLuint handle);
+OpenGLState& UnbindTexture(GLuint handle);
OpenGLState& ResetSampler(GLuint handle);
OpenGLState& ResetProgram(GLuint handle);
OpenGLState& ResetPipeline(GLuint handle);

@@ -150,7 +150,6 @@ void RendererOpenGL::LoadFBToScreenInfo(const Tegra::FramebufferConfig& framebuf
screen_info)) {
// Reset the screen info's display texture to its own permanent texture
screen_info.display_texture = screen_info.texture.resource.handle;
-screen_info.display_texcoords = MathUtil::Rectangle<float>(0.f, 0.f, 1.f, 1.f);
Memory::RasterizerFlushVirtualRegion(framebuffer_addr, size_in_bytes,
Memory::FlushMode::Flush);

@@ -27,7 +27,7 @@ struct TextureInfo {
/// Structure used for storing information about the display target for the Switch screen
struct ScreenInfo {
GLuint display_texture;
-MathUtil::Rectangle<float> display_texcoords;
+const MathUtil::Rectangle<float> display_texcoords{0.0f, 0.0f, 1.0f, 1.0f};
TextureInfo texture;
};

File diff suppressed because it is too large

@@ -0,0 +1,15 @@
// Copyright 2018 yuzu Emulator Project
// Licensed under GPLv2 or any later version
// Refer to the license.txt file included.
#pragma once
#include <cstdint>
#include <vector>
namespace Tegra::Texture::ASTC {
std::vector<uint8_t> Decompress(std::vector<uint8_t>& data, uint32_t width, uint32_t height,
uint32_t block_width, uint32_t block_height);
} // namespace Tegra::Texture::ASTC

@@ -53,6 +53,7 @@ u32 BytesPerPixel(TextureFormat format) {
case TextureFormat::DXT45:
// In this case a 'pixel' actually refers to a 4x4 tile.
return 16;
case TextureFormat::ASTC_2D_4X4:
case TextureFormat::A8R8G8B8:
case TextureFormat::A2B10G10R10:
case TextureFormat::BF10GF11RF11:
@@ -64,6 +65,8 @@ u32 BytesPerPixel(TextureFormat format) {
return 1;
case TextureFormat::R16_G16_B16_A16:
return 8;
case TextureFormat::R32_G32_B32_A32:
return 16;
default:
UNIMPLEMENTED_MSG("Format not implemented");
break;
@@ -93,7 +96,9 @@ std::vector<u8> UnswizzleTexture(VAddr address, TextureFormat format, u32 width,
case TextureFormat::B5G6R5:
case TextureFormat::R8:
case TextureFormat::R16_G16_B16_A16:
case TextureFormat::R32_G32_B32_A32:
case TextureFormat::BF10GF11RF11:
case TextureFormat::ASTC_2D_4X4:
CopySwizzledData(width, height, bytes_per_pixel, bytes_per_pixel, data,
unswizzled_data.data(), true, block_height);
break;
@@ -115,12 +120,14 @@ std::vector<u8> DecodeTexture(const std::vector<u8>& texture_data, TextureFormat
case TextureFormat::DXT23:
case TextureFormat::DXT45:
case TextureFormat::DXN1:
case TextureFormat::ASTC_2D_4X4:
case TextureFormat::A8R8G8B8:
case TextureFormat::A2B10G10R10:
case TextureFormat::A1B5G5R5:
case TextureFormat::B5G6R5:
case TextureFormat::R8:
case TextureFormat::BF10GF11RF11:
case TextureFormat::R32_G32_B32_A32:
// TODO(Subv): For the time being just forward the same data without any decoding.
rgba_data = texture_data;
break;
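
The BytesPerPixel hunks above mix two units: for DXT45 the returned 16 bytes describe a whole 4x4 tile, while for formats such as R16_G16_B16_A16 (8) and R32_G32_B32_A32 (16) the value is per pixel. A rough sketch of how a texture's byte size could be derived from those values is shown below. TileDim, SizeInBytes, the trimmed-down enum and Example are hypothetical helpers introduced only for illustration, and the value 4 for A8R8G8B8 follows from its 32-bit layout rather than from a line visible in the hunk.

```cpp
#include <cassert>
#include <cstdint>

namespace sketch {

// Trimmed-down stand-in for the project's TextureFormat enum.
enum class TextureFormat { DXT45, A8R8G8B8, R16_G16_B16_A16, R32_G32_B32_A32 };

// Bytes per addressable element. For DXT45 the element is a 4x4 tile.
inline std::uint32_t BytesPerElement(TextureFormat format) {
    switch (format) {
    case TextureFormat::DXT45:
        return 16; // one compressed 4x4 tile
    case TextureFormat::A8R8G8B8:
        return 4; // four 8-bit channels (assumption: not shown in the hunk)
    case TextureFormat::R16_G16_B16_A16:
        return 8;
    case TextureFormat::R32_G32_B32_A32:
        return 16;
    }
    return 0;
}

// Edge length, in pixels, of one addressable element.
inline std::uint32_t TileDim(TextureFormat format) {
    return format == TextureFormat::DXT45 ? 4 : 1;
}

// Total size in bytes; partial tiles at the edges are rounded up to full tiles.
inline std::uint64_t SizeInBytes(TextureFormat format, std::uint32_t width,
                                 std::uint32_t height) {
    const std::uint64_t tile = TileDim(format);
    const std::uint64_t tiles_x = (width + tile - 1) / tile;
    const std::uint64_t tiles_y = (height + tile - 1) / tile;
    return tiles_x * tiles_y * BytesPerElement(format);
}

// Usage example: an 8x8 DXT45 texture is 2x2 tiles of 16 bytes each.
inline void Example() {
    assert(SizeInBytes(TextureFormat::DXT45, 8, 8) == 64);
    assert(SizeInBytes(TextureFormat::A8R8G8B8, 8, 8) == 256);
}

} // namespace sketch
```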

@@ -84,6 +84,8 @@ void Config::ReadValues() {
qt_config->beginGroup("Renderer");
Settings::values.resolution_factor = qt_config->value("resolution_factor", 1.0).toFloat();
Settings::values.toggle_framelimit = qt_config->value("toggle_framelimit", true).toBool();
Settings::values.use_accurate_framebuffers =
qt_config->value("use_accurate_framebuffers", false).toBool();
Settings::values.bg_red = qt_config->value("bg_red", 0.0).toFloat();
Settings::values.bg_green = qt_config->value("bg_green", 0.0).toFloat();
@@ -184,6 +186,7 @@ void Config::SaveValues() {
qt_config->beginGroup("Renderer");
qt_config->setValue("resolution_factor", (double)Settings::values.resolution_factor);
qt_config->setValue("toggle_framelimit", Settings::values.toggle_framelimit);
qt_config->setValue("use_accurate_framebuffers", Settings::values.use_accurate_framebuffers);
// Cast to double because Qt's written float values are not human-readable
qt_config->setValue("bg_red", (double)Settings::values.bg_red);

@@ -59,11 +59,13 @@ void ConfigureGraphics::setConfiguration() {
ui->resolution_factor_combobox->setCurrentIndex(
static_cast<int>(FromResolutionFactor(Settings::values.resolution_factor)));
ui->toggle_framelimit->setChecked(Settings::values.toggle_framelimit);
ui->use_accurate_framebuffers->setChecked(Settings::values.use_accurate_framebuffers);
}
void ConfigureGraphics::applyConfiguration() {
Settings::values.resolution_factor =
ToResolutionFactor(static_cast<Resolution>(ui->resolution_factor_combobox->currentIndex()));
Settings::values.toggle_framelimit = ui->toggle_framelimit->isChecked();
Settings::values.use_accurate_framebuffers = ui->use_accurate_framebuffers->isChecked();
Settings::Apply();
}

@@ -29,6 +29,13 @@
</property>
</widget>
</item>
<item>
<widget class="QCheckBox" name="use_accurate_framebuffers">
<property name="text">
<string>Use accurate framebuffers (slow)</string>
</property>
</widget>
</item>
<item>
<layout class="QHBoxLayout" name="horizontalLayout">
<item>

@@ -213,6 +213,9 @@ QString WaitTreeThread::GetText() const {
case THREADSTATUS_WAIT_MUTEX:
status = tr("waiting for mutex");
break;
case THREADSTATUS_WAIT_ARB:
status = tr("waiting for address arbiter");
break;
case THREADSTATUS_DORMANT:
status = tr("dormant");
break;
@@ -240,6 +243,7 @@ QColor WaitTreeThread::GetColor() const {
case THREADSTATUS_WAIT_SYNCH_ALL:
case THREADSTATUS_WAIT_SYNCH_ANY:
case THREADSTATUS_WAIT_MUTEX:
case THREADSTATUS_WAIT_ARB:
return QColor(Qt::GlobalColor::red);
case THREADSTATUS_DORMANT:
return QColor(Qt::GlobalColor::darkCyan);

@@ -366,7 +366,7 @@ void GameList::LoadInterfaceLayout() {
item_model->sort(header->sortIndicatorSection(), header->sortIndicatorOrder());
}
-const QStringList GameList::supported_file_extensions = {"nso", "nro"};
+const QStringList GameList::supported_file_extensions = {"nso", "nro", "nca"};
static bool HasSupportedFileExtension(const std::string& file_name) {
QFileInfo file = QFileInfo(file_name.c_str());
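
The hunk above extends the supported extension list to "nso", "nro" and "nca" but cuts off before the body of HasSupportedFileExtension. As an illustration only, the check could be approximated without Qt as below. The suffix extraction and the case-insensitive comparison are assumptions, since the real function uses QFileInfo and its remaining lines are not shown; kSupportedExtensions and Example are hypothetical names.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cctype>
#include <string>

namespace sketch {

// Same entries as the updated list in the hunk above.
const std::array<std::string, 3> kSupportedExtensions = {"nso", "nro", "nca"};

inline bool HasSupportedFileExtension(const std::string& file_name) {
    const auto dot = file_name.find_last_of('.');
    const auto slash = file_name.find_last_of("/\\");
    // No dot at all, or the last dot belongs to a directory component.
    if (dot == std::string::npos || (slash != std::string::npos && dot < slash)) {
        return false;
    }
    std::string ext = file_name.substr(dot + 1);
    // Lower-case the suffix so "GAME.NCA" matches (assumed behaviour).
    std::transform(ext.begin(), ext.end(), ext.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return std::find(kSupportedExtensions.begin(), kSupportedExtensions.end(), ext) !=
           kSupportedExtensions.end();
}

// Usage example: "nca" is accepted after this change, other suffixes are not.
inline void Example() {
    assert(HasSupportedFileExtension("game.nca"));
    assert(!HasSupportedFileExtension("game.xci"));
}

} // namespace sketch
```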

@@ -334,8 +334,6 @@ bool GMainWindow::SupportsRequiredGLExtensions() {
unsupported_ext.append("ARB_program_interface_query");
if (!GLAD_GL_ARB_separate_shader_objects)
unsupported_ext.append("ARB_separate_shader_objects");
-if (!GLAD_GL_ARB_shader_storage_buffer_object)
-unsupported_ext.append("ARB_shader_storage_buffer_object");
if (!GLAD_GL_ARB_vertex_attrib_binding)
unsupported_ext.append("ARB_vertex_attrib_binding");

@@ -98,6 +98,8 @@ void Config::ReadValues() {
(float)sdl2_config->GetReal("Renderer", "resolution_factor", 1.0);
Settings::values.toggle_framelimit =
sdl2_config->GetBoolean("Renderer", "toggle_framelimit", true);
Settings::values.use_accurate_framebuffers =
sdl2_config->GetBoolean("Renderer", "use_accurate_framebuffers", false);
Settings::values.bg_red = (float)sdl2_config->GetReal("Renderer", "bg_red", 0.0);
Settings::values.bg_green = (float)sdl2_config->GetReal("Renderer", "bg_green", 0.0);

@@ -102,6 +102,10 @@ resolution_factor =
# 0 (default): Off, 1: On
use_vsync =
# Whether to use accurate framebuffers
# 0 (default): Off (fast), 1 : On (slow)
use_accurate_framebuffers =
# The clear color for the renderer. What shows up on the sides of the bottom screen.
# Must be in range of 0.0-1.0. Defaults to 1.0 for all.
bg_red =