47 Commits

Author SHA1 Message Date
Morph
c9710f6c78 api_version: Update and add AtmosphereTargetFirmware 2021-09-10 01:10:47 -04:00
bunnei
7e9163779d Merge pull request #6962 from vonchenplus/spirv_support_legacy_attribute
renderer_vulkan: Spirv support glsl legacy attribute
2021-09-08 14:04:44 -07:00
Fernando S
6b16f7807e Merge pull request #6980 from vonchenplus/fix_blend_equation_error
Fix blend equation enum error
2021-09-08 11:50:26 +02:00
Ameer J
eb1ba45c39 Merge pull request #6971 from bunnei/buffer-queue-kevent
core: hle: service: buffer_queue: Improve management of KEvent.
2021-09-08 00:34:36 -04:00
Feng Chen
b1e655f898 Detail adjustment 2021-09-08 10:30:00 +08:00
Feng Chen
bbc1800c1b Detail adjustment 2021-09-08 09:53:10 +08:00
Feng Chen
e5ca733722 Re-implement get unused location 2021-09-07 13:22:52 +08:00
Feng Chen
9cdf2383e9 Move attribute related definitions to spirv anonymous namespace 2021-09-07 12:34:35 +08:00
Ameer J
ab73787d8f Merge pull request #6977 from Moonlacer/master
Second part of Golden's PR #6976
2021-09-06 22:58:23 -04:00
Ameer J
743428e025 Merge pull request #6976 from goldenx86/patch-2
Rename all shader cache strings to pipeline cache
2021-09-06 22:58:03 -04:00
Feng Chen
0292374807 Fix blend equation enum error 2021-09-07 10:12:09 +08:00
Moonlacer
bdd153bc0d Second part of Golden's PR 2021-09-06 15:25:40 -05:00
Matías Locatti
296fa4e06e Rename all shader cache references to pipeline cache
After Hades, both OpenGL and Vulkan use a pipeline cache instead of single stages of the graphics pipeline. Renamed the Remove menu entries to match.
2021-09-06 15:53:04 -03:00
bunnei
51ccc29cdd Merge pull request #6965 from bunnei/cpu_manager_jthread
core: cpu_manager: Use jthread.
2021-09-06 03:49:14 -07:00
Feng Chen
1de9e4e121 Dynamic get unused location 2021-09-06 10:46:03 +08:00
Feng Chen
d994466a08 Implement intput and output fixed fnc textures 2021-09-06 10:36:45 +08:00
bunnei
e05bfd2f54 core: hle: service: buffer_queue: Improve management of KEvent. 2021-09-04 22:25:46 -07:00
bunnei
d9ce179ec2 Merge pull request #6968 from bunnei/nvflinger-event
core: hle: service: nvflinger/vi: Improve management of KEvent.
2021-09-04 22:25:20 -07:00
bunnei
fb3e9314b9 core: hle: service: nvflinger/vi: Improve management of KEvent. 2021-09-03 21:53:00 -07:00
bunnei
25a97e0139 core: cpu_manager: Use jthread. 2021-09-03 19:05:41 -07:00
Feng Chen
a7bbaa4897 Rename parameters 2021-09-03 23:52:20 +08:00
Feng Chen
cf26f375ff Fix create GraphicsPipelines crash 2021-09-03 22:55:53 +08:00
Feng Chen
1e2a89d306 Add input/output location 2021-09-02 23:34:51 +08:00
bunnei
b2572a56d3 Merge pull request #6900 from ameerj/attr-reorder
structured_control_flow: Add DemoteCombinationPass
2021-09-01 17:36:26 -07:00
Mai M
25444041d0 Merge pull request #6951 from german77/log
common/logging: Add missing include
2021-09-01 20:21:15 -04:00
german77
c57e0b3b24 common/logging: Add missing include 2021-09-01 19:13:33 -05:00
bunnei
956171f024 Merge pull request #6897 from FernandoS27/pineapple-does-not-belong-in-pizza
Project <tentative title>: Rework Garbage Collection.
2021-08-31 09:11:21 -07:00
Feng Chen
73b11f390e Add colorfront and txtcoord support 2021-09-01 00:07:25 +08:00
bunnei
122eb3cbd0 Merge pull request #6928 from ameerj/spirv-get-frontface
emit_spirv_context_get_set: Fix Get FrontFace return value
2021-08-30 18:16:31 -07:00
bunnei
ec19d9594f Merge pull request #6879 from ameerj/decoder-assert
vk_blit_screen: Fix non-accelerated texture size calculation
2021-08-30 15:24:04 -07:00
ameerj
907dfbea71 structured_control_flow: Skip reordering nested demote branches.
Nested demote branches add complexity with combining the condition if it has not been initialized yet. Skip them for the time being.
2021-08-30 11:46:25 -04:00
ameerj
4fda7f1c82 structured_control_flow: Conditionally invoke demote reorder pass
This is only needed on select drivers when a fragment shader discards/demotes.
2021-08-30 11:46:24 -04:00
Fernando Sahmkow
fe0acec539 Garbage Collection: Make it more agressive on high priority mode. 2021-08-29 18:57:17 +02:00
Fernando Sahmkow
ff48f06fb9 Garbage Collection: Adress Feedback. 2021-08-29 18:19:53 +02:00
bunnei
5f19b66189 Merge pull request #6905 from Morph1984/nifm-misc
nifm/network_interface: Cleanup and populate fields in GetCurrentNetworkProfile
2021-08-29 00:04:58 -07:00
bunnei
4e88989435 Merge pull request #6921 from ameerj/vp9-unused
vp9_types: Remove unusued VP9 info struct members
2021-08-28 22:20:09 -07:00
Morph
7d2265b6a8 Merge pull request #6927 from german77/ngct
ngct: Stub NGCT:U service
2021-08-28 23:03:29 -04:00
Fernando Sahmkow
ba82bb359b Garbage Collection: enable as default, eliminate option. 2021-08-28 17:55:37 +02:00
Fernando Sahmkow
d540d284b5 VideoCore: Rework Garbage Collection. 2021-08-28 17:54:12 +02:00
ameerj
862dc2b2b3 structured_control_flow: Add DemoteCombinationPass
Some drivers misread data when demotes are interleaved in the program. This moves demote branches to be checked at the end of the program.
Fixes "wireframe" issue in Pokemon SwSh on some drivers
2021-08-28 11:35:25 -04:00
german77
f134a5e56c ngct: Stub NGCT:U service 2021-08-27 14:15:34 -05:00
bunnei
c20ea89390 Merge pull request #6929 from yuzu-emu/revert-6870-trace-back-stack-back-stack-back
Revert "logging: Display backtrace on crash"
2021-08-27 02:23:01 -07:00
Morph
790a09bc93 Revert "logging: Display backtrace on crash" 2021-08-27 04:01:22 -04:00
ameerj
6e407c02d8 emit_spirv_context_get_set: Fix Get FrontFace return value
The IR expects GetAttribute to return an F32 value. This case was returning a U32 instead.
2021-08-26 21:37:34 -04:00
ameerj
eb2624ed65 vp9_types: Minor refactor of VP9 info structs. 2021-08-25 21:42:43 -04:00
ameerj
3de38c9a70 vp9_types: Remove unused Vp9PictureInfo members 2021-08-25 21:29:22 -04:00
ameerj
537c6ac8fe vk_blit_screen: Fix non-accelerated texture size calculation
Addresses the potential OOB access in UnswizzleTexture.
2021-08-16 14:28:10 -04:00
47 changed files with 667 additions and 381 deletions

@@ -176,6 +176,3 @@ if (MSVC)
else()
target_link_libraries(common PRIVATE zstd)
endif()
if (${CMAKE_SYSTEM_NAME} STREQUAL "Linux" AND CMAKE_CXX_COMPILER_ID STREQUAL GNU)
target_link_libraries(common PRIVATE backtrace)
endif()

@@ -13,19 +13,12 @@
#include <windows.h> // For OutputDebugStringW
#endif
#if defined(__linux__) && defined(__GNUG__) && !defined(__clang__)
#define BOOST_STACKTRACE_USE_BACKTRACE
#include <boost/stacktrace.hpp>
#undef BOOST_STACKTRACE_USE_BACKTRACE
#include <signal.h>
#define YUZU_LINUX_GCC_BACKTRACE
#endif
#include "common/fs/file.h"
#include "common/fs/fs.h"
#include "common/fs/fs_paths.h"
#include "common/fs/path_util.h"
#include "common/literals.h"
#include "common/thread.h"
#include "common/logging/backend.h"
#include "common/logging/log.h"
@@ -163,14 +156,6 @@ public:
bool initialization_in_progress_suppress_logging = true;
#ifdef YUZU_LINUX_GCC_BACKTRACE
[[noreturn]] void SleepForever() {
while (true) {
pause();
}
}
#endif
/**
* Static state as a singleton.
*/
@@ -242,66 +227,9 @@ private:
while (max_logs_to_write-- && message_queue.Pop(entry)) {
write_logs();
}
})} {
#ifdef YUZU_LINUX_GCC_BACKTRACE
int waker_pipefd[2];
int done_printing_pipefd[2];
if (pipe2(waker_pipefd, O_CLOEXEC) || pipe2(done_printing_pipefd, O_CLOEXEC)) {
abort();
}
backtrace_thread_waker_fd = waker_pipefd[1];
backtrace_done_printing_fd = done_printing_pipefd[0];
std::thread([this, wait_fd = waker_pipefd[0], done_fd = done_printing_pipefd[1]] {
Common::SetCurrentThreadName("yuzu:Crash");
for (u8 ignore = 0; read(wait_fd, &ignore, 1) != 1;)
;
const int sig = received_signal;
if (sig <= 0) {
abort();
}
StopBackendThread();
const auto signal_entry =
CreateEntry(Class::Log, Level::Critical, "?", 0, "?",
fmt::vformat("Received signal {}", fmt::make_format_args(sig)));
ForEachBackend([&signal_entry](Backend& backend) {
backend.EnableForStacktrace();
backend.Write(signal_entry);
});
const auto backtrace =
boost::stacktrace::stacktrace::from_dump(backtrace_storage.data(), 4096);
for (const auto& frame : backtrace.as_vector()) {
auto line = boost::stacktrace::detail::to_string(&frame, 1);
if (line.empty()) {
abort();
}
line.pop_back(); // Remove newline
const auto frame_entry =
CreateEntry(Class::Log, Level::Critical, "?", 0, "?", line);
ForEachBackend([&frame_entry](Backend& backend) { backend.Write(frame_entry); });
}
using namespace std::literals;
const auto rip_entry = CreateEntry(Class::Log, Level::Critical, "?", 0, "?", "RIP"s);
ForEachBackend([&rip_entry](Backend& backend) {
backend.Write(rip_entry);
backend.Flush();
});
for (const u8 anything = 0; write(done_fd, &anything, 1) != 1;)
;
// Abort on original thread to help debugging
SleepForever();
}).detach();
signal(SIGSEGV, &HandleSignal);
signal(SIGABRT, &HandleSignal);
#endif
}
})} {}
~Impl() {
#ifdef YUZU_LINUX_GCC_BACKTRACE
if (int zero_or_ignore = 0;
!received_signal.compare_exchange_strong(zero_or_ignore, SIGKILL)) {
SleepForever();
}
#endif
StopBackendThread();
}
@@ -340,36 +268,6 @@ private:
delete ptr;
}
#ifdef YUZU_LINUX_GCC_BACKTRACE
[[noreturn]] static void HandleSignal(int sig) {
signal(SIGABRT, SIG_DFL);
signal(SIGSEGV, SIG_DFL);
if (sig <= 0) {
abort();
}
instance->InstanceHandleSignal(sig);
}
[[noreturn]] void InstanceHandleSignal(int sig) {
if (int zero_or_ignore = 0; !received_signal.compare_exchange_strong(zero_or_ignore, sig)) {
if (received_signal == SIGKILL) {
abort();
}
SleepForever();
}
// Don't restart like boost suggests. We want to append to the log file and not lose dynamic
// symbols. This may segfault if it unwinds outside C/C++ code but we'll just have to fall
// back to core dumps.
boost::stacktrace::safe_dump_to(backtrace_storage.data(), 4096);
std::atomic_thread_fence(std::memory_order_seq_cst);
for (const int anything = 0; write(backtrace_thread_waker_fd, &anything, 1) != 1;)
;
for (u8 ignore = 0; read(backtrace_done_printing_fd, &ignore, 1) != 1;)
;
abort();
}
#endif
static inline std::unique_ptr<Impl, decltype(&Deleter)> instance{nullptr, Deleter};
Filter filter;
@@ -380,13 +278,6 @@ private:
std::thread backend_thread;
MPSCQueue<Entry> message_queue{};
std::chrono::steady_clock::time_point time_origin{std::chrono::steady_clock::now()};
#ifdef YUZU_LINUX_GCC_BACKTRACE
std::atomic_int received_signal{0};
std::array<u8, 4096> backtrace_storage{};
int backtrace_thread_waker_fd;
int backtrace_done_printing_fd;
#endif
};
} // namespace

@@ -111,6 +111,7 @@ bool ParseFilterRule(Filter& instance, Iterator begin, Iterator end) {
SUB(Service, NCM) \
SUB(Service, NFC) \
SUB(Service, NFP) \
SUB(Service, NGCT) \
SUB(Service, NIFM) \
SUB(Service, NIM) \
SUB(Service, NPNS) \

@@ -81,6 +81,7 @@ enum class Class : u8 {
Service_NCM, ///< The NCM service
Service_NFC, ///< The NFC (Near-field communication) service
Service_NFP, ///< The NFP service
Service_NGCT, ///< The NGCT (No Good Content for Terra) service
Service_NIFM, ///< The NIFM (Network interface) service
Service_NIM, ///< The NIM service
Service_NPNS, ///< The NPNS service

src/common/lru_cache.h (new file, 140 lines)

@@ -0,0 +1,140 @@
// Copyright 2021 yuzu Emulator Project
// Licensed under GPLv2+ or any later version
// Refer to the license.txt file included.
#pragma once
#include <deque>
#include <memory>
#include <type_traits>
#include "common/common_types.h"
namespace Common {
template <class Traits>
class LeastRecentlyUsedCache {
using ObjectType = typename Traits::ObjectType;
using TickType = typename Traits::TickType;
struct Item {
ObjectType obj;
TickType tick;
Item* next{};
Item* prev{};
};
public:
LeastRecentlyUsedCache() : first_item{}, last_item{} {}
~LeastRecentlyUsedCache() = default;
size_t Insert(ObjectType obj, TickType tick) {
const auto new_id = Build();
auto& item = item_pool[new_id];
item.obj = obj;
item.tick = tick;
Attach(item);
return new_id;
}
void Touch(size_t id, TickType tick) {
auto& item = item_pool[id];
if (item.tick >= tick) {
return;
}
item.tick = tick;
if (&item == last_item) {
return;
}
Detach(item);
Attach(item);
}
void Free(size_t id) {
auto& item = item_pool[id];
Detach(item);
item.prev = nullptr;
item.next = nullptr;
free_items.push_back(id);
}
template <typename Func>
void ForEachItemBelow(TickType tick, Func&& func) {
static constexpr bool RETURNS_BOOL =
std::is_same_v<std::invoke_result_t<Func, ObjectType>, bool>;
Item* iterator = first_item;
while (iterator) {
if (static_cast<s64>(tick) - static_cast<s64>(iterator->tick) < 0) {
return;
}
Item* next = iterator->next;
if constexpr (RETURNS_BOOL) {
if (func(iterator->obj)) {
return;
}
} else {
func(iterator->obj);
}
iterator = next;
}
}
private:
size_t Build() {
if (free_items.empty()) {
const size_t item_id = item_pool.size();
auto& item = item_pool.emplace_back();
item.next = nullptr;
item.prev = nullptr;
return item_id;
}
const size_t item_id = free_items.front();
free_items.pop_front();
auto& item = item_pool[item_id];
item.next = nullptr;
item.prev = nullptr;
return item_id;
}
void Attach(Item& item) {
if (!first_item) {
first_item = &item;
}
if (!last_item) {
last_item = &item;
} else {
item.prev = last_item;
last_item->next = &item;
item.next = nullptr;
last_item = &item;
}
}
void Detach(Item& item) {
if (item.prev) {
item.prev->next = item.next;
}
if (item.next) {
item.next->prev = item.prev;
}
if (&item == first_item) {
first_item = item.next;
if (first_item) {
first_item->prev = nullptr;
}
}
if (&item == last_item) {
last_item = item.prev;
if (last_item) {
last_item->next = nullptr;
}
}
}
std::deque<Item> item_pool;
std::deque<size_t> free_items;
Item* first_item{};
Item* last_item{};
};
} // namespace Common
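
The tick-ordered cache above keeps the least recently touched item at the front, so a collector can walk from the front and stop at the first item newer than its threshold. The following is a minimal sketch of that idea, not the header's implementation: `TinyLru` and `CollectStale` are hypothetical names, and it substitutes `std::list` for the hand-rolled intrusive list and ID pool.

```cpp
#include <cstdint>
#include <iterator>
#include <list>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical, simplified stand-in for a tick-ordered LRU list.
template <class T>
class TinyLru {
public:
    using Tick = std::uint64_t;
    using Handle = typename std::list<std::pair<T, Tick>>::iterator;

    // New items are the most recently used, so they go to the back.
    Handle Insert(T obj, Tick tick) {
        items.emplace_back(std::move(obj), tick);
        return std::prev(items.end());
    }

    // Refresh an item's tick and move it to the back of the list.
    void Touch(Handle handle, Tick tick) {
        if (handle->second >= tick) {
            return;
        }
        handle->second = tick;
        items.splice(items.end(), items, handle);
    }

    // Visit items from oldest to newest while their tick is <= `tick`.
    // A callback returning bool can stop the walk early by returning true.
    template <class Func>
    void ForEachItemBelow(Tick tick, Func&& func) {
        constexpr bool returns_bool = std::is_same_v<std::invoke_result_t<Func, T&>, bool>;
        for (auto it = items.begin(); it != items.end();) {
            if (it->second > tick) {
                return;
            }
            const auto next = std::next(it);
            if constexpr (returns_bool) {
                if (func(it->first)) {
                    return;
                }
            } else {
                func(it->first);
            }
            it = next;
        }
    }

private:
    std::list<std::pair<T, Tick>> items;
};

// Example: three objects, the first one is touched again and so survives.
std::vector<int> CollectStale() {
    TinyLru<int> lru;
    const auto first = lru.Insert(1, 10);
    lru.Insert(2, 20);
    lru.Insert(3, 30);
    lru.Touch(first, 40);
    std::vector<int> stale;
    lru.ForEachItemBelow(25, [&stale](int& value) { stale.push_back(value); });
    return stale;
}
```

In this sketch only object `2` is reported as stale for a threshold of 25, because object `1` was moved to the back by `Touch` and object `3` is newer than the threshold.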

@@ -59,7 +59,6 @@ void LogSettings() {
log_setting("Renderer_UseVsync", values.use_vsync.GetValue());
log_setting("Renderer_ShaderBackend", values.shader_backend.GetValue());
log_setting("Renderer_UseAsynchronousShaders", values.use_asynchronous_shaders.GetValue());
log_setting("Renderer_UseGarbageCollection", values.use_caches_gc.GetValue());
log_setting("Renderer_AnisotropicFilteringLevel", values.max_anisotropy.GetValue());
log_setting("Audio_OutputEngine", values.sink_id.GetValue());
log_setting("Audio_EnableAudioStretching", values.enable_audio_stretching.GetValue());
@@ -143,7 +142,6 @@ void RestoreGlobalState(bool is_powered_on) {
values.shader_backend.SetGlobal(true);
values.use_asynchronous_shaders.SetGlobal(true);
values.use_fast_gpu_time.SetGlobal(true);
values.use_caches_gc.SetGlobal(true);
values.bg_red.SetGlobal(true);
values.bg_green.SetGlobal(true);
values.bg_blue.SetGlobal(true);

@@ -475,7 +475,6 @@ struct Values {
ShaderBackend::SPIRV, "shader_backend"};
Setting<bool> use_asynchronous_shaders{false, "use_asynchronous_shaders"};
Setting<bool> use_fast_gpu_time{true, "use_fast_gpu_time"};
Setting<bool> use_caches_gc{false, "use_caches_gc"};
Setting<u8> bg_red{0, "bg_red"};
Setting<u8> bg_green{0, "bg_green"};

@@ -452,6 +452,8 @@ add_library(core STATIC
hle/service/nfp/nfp.h
hle/service/nfp/nfp_user.cpp
hle/service/nfp/nfp_user.h
hle/service/ngct/ngct.cpp
hle/service/ngct/ngct.h
hle/service/nifm/nifm.cpp
hle/service/nifm/nifm.h
hle/service/nim/nim.cpp

@@ -21,34 +21,25 @@ namespace Core {
CpuManager::CpuManager(System& system_) : system{system_} {}
CpuManager::~CpuManager() = default;
void CpuManager::ThreadStart(CpuManager& cpu_manager, std::size_t core) {
cpu_manager.RunThread(core);
void CpuManager::ThreadStart(std::stop_token stop_token, CpuManager& cpu_manager,
std::size_t core) {
cpu_manager.RunThread(stop_token, core);
}
void CpuManager::Initialize() {
running_mode = true;
if (is_multicore) {
for (std::size_t core = 0; core < Core::Hardware::NUM_CPU_CORES; core++) {
core_data[core].host_thread =
std::make_unique<std::thread>(ThreadStart, std::ref(*this), core);
core_data[core].host_thread = std::jthread(ThreadStart, std::ref(*this), core);
}
} else {
core_data[0].host_thread = std::make_unique<std::thread>(ThreadStart, std::ref(*this), 0);
core_data[0].host_thread = std::jthread(ThreadStart, std::ref(*this), 0);
}
}
void CpuManager::Shutdown() {
running_mode = false;
Pause(false);
if (is_multicore) {
for (auto& data : core_data) {
data.host_thread->join();
data.host_thread.reset();
}
} else {
core_data[0].host_thread->join();
core_data[0].host_thread.reset();
}
}
std::function<void(void*)> CpuManager::GetGuestThreadStartFunc() {
@@ -317,7 +308,7 @@ void CpuManager::Pause(bool paused) {
}
}
void CpuManager::RunThread(std::size_t core) {
void CpuManager::RunThread(std::stop_token stop_token, std::size_t core) {
/// Initialization
system.RegisterCoreThread(core);
std::string name;
@@ -361,6 +352,10 @@ void CpuManager::RunThread(std::size_t core) {
return;
}
if (stop_token.stop_requested()) {
break;
}
auto current_thread = system.Kernel().CurrentScheduler()->GetCurrentThread();
data.is_running = true;
Common::Fiber::YieldTo(data.host_context, *current_thread->GetHostContext());

@@ -78,9 +78,9 @@ private:
void SingleCoreRunSuspendThread();
void SingleCorePause(bool paused);
static void ThreadStart(CpuManager& cpu_manager, std::size_t core);
static void ThreadStart(std::stop_token stop_token, CpuManager& cpu_manager, std::size_t core);
void RunThread(std::size_t core);
void RunThread(std::stop_token stop_token, std::size_t core);
struct CoreData {
std::shared_ptr<Common::Fiber> host_context;
@@ -89,7 +89,7 @@ private:
std::atomic<bool> is_running;
std::atomic<bool> is_paused;
std::atomic<bool> initialized;
std::unique_ptr<std::thread> host_thread;
std::jthread host_thread;
};
std::atomic<bool> running_mode{};

@@ -28,13 +28,20 @@ constexpr char DISPLAY_TITLE[] = "NintendoSDK Firmware for NX 12.1.0-1.0";
// Atmosphere version constants.
constexpr u8 ATMOSPHERE_RELEASE_VERSION_MAJOR = 0;
constexpr u8 ATMOSPHERE_RELEASE_VERSION_MINOR = 19;
constexpr u8 ATMOSPHERE_RELEASE_VERSION_MICRO = 5;
constexpr u8 ATMOSPHERE_RELEASE_VERSION_MAJOR = 1;
constexpr u8 ATMOSPHERE_RELEASE_VERSION_MINOR = 0;
constexpr u8 ATMOSPHERE_RELEASE_VERSION_MICRO = 0;
constexpr u32 AtmosphereTargetFirmwareWithRevision(u8 major, u8 minor, u8 micro, u8 rev) {
return u32{major} << 24 | u32{minor} << 16 | u32{micro} << 8 | u32{rev};
}
constexpr u32 AtmosphereTargetFirmware(u8 major, u8 minor, u8 micro) {
return AtmosphereTargetFirmwareWithRevision(major, minor, micro, 0);
}
constexpr u32 GetTargetFirmware() {
return u32{HOS_VERSION_MAJOR} << 24 | u32{HOS_VERSION_MINOR} << 16 |
u32{HOS_VERSION_MICRO} << 8 | 0U;
return AtmosphereTargetFirmware(HOS_VERSION_MAJOR, HOS_VERSION_MINOR, HOS_VERSION_MICRO);
}
} // namespace HLE::ApiVersion
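
The packing used by `AtmosphereTargetFirmwareWithRevision` puts one byte per version component into a 32-bit value, so packed versions compare in the same order as the versions themselves. A small sketch of that arithmetic follows; the `u8`/`u32` aliases and the `PackFirmware`/`FirmwareMinor` helpers are illustrative assumptions, not names from the codebase.

```cpp
#include <cstdint>

using u8 = std::uint8_t;   // assumed to match the project's alias
using u32 = std::uint32_t; // assumed to match the project's alias

// major | minor | micro | revision, one byte each, most significant first.
constexpr u32 PackFirmware(u8 major, u8 minor, u8 micro, u8 rev = 0) {
    return u32{major} << 24 | u32{minor} << 16 | u32{micro} << 8 | u32{rev};
}

// Hypothetical helper: recover the minor component from a packed value.
constexpr u8 FirmwareMinor(u32 packed) {
    return static_cast<u8>((packed >> 16) & 0xFF);
}

// 12.1.0 packs to 0x0C010000.
static_assert(PackFirmware(12, 1, 0) == 0x0C010000u);
// Packed values order the same way as the versions they encode.
static_assert(PackFirmware(12, 1, 0) > PackFirmware(11, 0, 1));
static_assert(FirmwareMinor(PackFirmware(12, 1, 0)) == 1);
```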

@@ -0,0 +1,46 @@
// Copyright 2021 yuzu Emulator Project
// Licensed under GPLv2 or any later version
// Refer to the license.txt file included
#include "common/string_util.h"
#include "core/core.h"
#include "core/hle/ipc_helpers.h"
#include "core/hle/service/ngct/ngct.h"
#include "core/hle/service/service.h"
namespace Service::NGCT {
class IService final : public ServiceFramework<IService> {
public:
explicit IService(Core::System& system_) : ServiceFramework{system_, "ngct:u"} {
// clang-format off
static const FunctionInfo functions[] = {
{0, nullptr, "Match"},
{1, &IService::Filter, "Filter"},
};
// clang-format on
RegisterHandlers(functions);
}
private:
void Filter(Kernel::HLERequestContext& ctx) {
const auto buffer = ctx.ReadBuffer();
const auto text = Common::StringFromFixedZeroTerminatedBuffer(
reinterpret_cast<const char*>(buffer.data()), buffer.size());
LOG_WARNING(Service_NGCT, "(STUBBED) called, text={}", text);
// Return the same string since we don't censor anything
ctx.WriteBuffer(buffer);
IPC::ResponseBuilder rb{ctx, 2};
rb.Push(ResultSuccess);
}
};
void InstallInterfaces(SM::ServiceManager& service_manager, Core::System& system) {
std::make_shared<IService>(system)->InstallAsService(system.ServiceManager());
}
} // namespace Service::NGCT
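
The stubbed `Filter` handler reads a string from a fixed-size buffer that may or may not contain a terminating NUL. A sketch of that kind of read is shown below; `StringFromFixedBuffer` is a hypothetical stand-in written for illustration and is not claimed to match the body of `Common::StringFromFixedZeroTerminatedBuffer`.

```cpp
#include <cstddef>
#include <string>

// Copy characters up to the first NUL, but never read past max_len bytes.
std::string StringFromFixedBuffer(const char* buffer, std::size_t max_len) {
    std::size_t len = 0;
    while (len < max_len && buffer[len] != '\0') {
        ++len;
    }
    return std::string(buffer, len);
}
```

Bounding the scan by the buffer size matters here because the buffer comes from the guest and is not guaranteed to be terminated.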

@@ -0,0 +1,20 @@
// Copyright 2021 yuzu Emulator Project
// Licensed under GPLv2 or any later version
// Refer to the license.txt file included
#pragma once
namespace Core {
class System;
}
namespace Service::SM {
class ServiceManager;
}
namespace Service::NGCT {
/// Registers all NGCT services with the specified service manager.
void InstallInterfaces(SM::ServiceManager& service_manager, Core::System& system);
} // namespace Service::NGCT

@@ -9,17 +9,20 @@
#include "core/core.h"
#include "core/hle/kernel/k_writable_event.h"
#include "core/hle/kernel/kernel.h"
#include "core/hle/service/kernel_helpers.h"
#include "core/hle/service/nvflinger/buffer_queue.h"
namespace Service::NVFlinger {
BufferQueue::BufferQueue(Kernel::KernelCore& kernel, u32 id_, u64 layer_id_)
: id(id_), layer_id(layer_id_), buffer_wait_event{kernel} {
Kernel::KAutoObject::Create(std::addressof(buffer_wait_event));
buffer_wait_event.Initialize("BufferQueue:WaitEvent");
BufferQueue::BufferQueue(Kernel::KernelCore& kernel, u32 id_, u64 layer_id_,
KernelHelpers::ServiceContext& service_context_)
: id(id_), layer_id(layer_id_), service_context{service_context_} {
buffer_wait_event = service_context.CreateEvent("BufferQueue:WaitEvent");
}
BufferQueue::~BufferQueue() = default;
BufferQueue::~BufferQueue() {
service_context.CloseEvent(buffer_wait_event);
}
void BufferQueue::SetPreallocatedBuffer(u32 slot, const IGBPBuffer& igbp_buffer) {
ASSERT(slot < buffer_slots);
@@ -41,7 +44,7 @@ void BufferQueue::SetPreallocatedBuffer(u32 slot, const IGBPBuffer& igbp_buffer)
.multi_fence = {},
};
buffer_wait_event.GetWritableEvent().Signal();
buffer_wait_event->GetWritableEvent().Signal();
}
std::optional<std::pair<u32, Service::Nvidia::MultiFence*>> BufferQueue::DequeueBuffer(u32 width,
@@ -119,7 +122,7 @@ void BufferQueue::CancelBuffer(u32 slot, const Service::Nvidia::MultiFence& mult
}
free_buffers_condition.notify_one();
buffer_wait_event.GetWritableEvent().Signal();
buffer_wait_event->GetWritableEvent().Signal();
}
std::optional<std::reference_wrapper<const BufferQueue::Buffer>> BufferQueue::AcquireBuffer() {
@@ -154,7 +157,7 @@ void BufferQueue::ReleaseBuffer(u32 slot) {
}
free_buffers_condition.notify_one();
buffer_wait_event.GetWritableEvent().Signal();
buffer_wait_event->GetWritableEvent().Signal();
}
void BufferQueue::Connect() {
@@ -169,7 +172,7 @@ void BufferQueue::Disconnect() {
std::unique_lock lock{queue_sequence_mutex};
queue_sequence.clear();
}
buffer_wait_event.GetWritableEvent().Signal();
buffer_wait_event->GetWritableEvent().Signal();
is_connect = false;
free_buffers_condition.notify_one();
}
@@ -189,11 +192,11 @@ u32 BufferQueue::Query(QueryType type) {
}
Kernel::KWritableEvent& BufferQueue::GetWritableBufferWaitEvent() {
return buffer_wait_event.GetWritableEvent();
return buffer_wait_event->GetWritableEvent();
}
Kernel::KReadableEvent& BufferQueue::GetBufferWaitEvent() {
return buffer_wait_event.GetReadableEvent();
return buffer_wait_event->GetReadableEvent();
}
} // namespace Service::NVFlinger

@@ -24,6 +24,10 @@ class KReadableEvent;
class KWritableEvent;
} // namespace Kernel
namespace Service::KernelHelpers {
class ServiceContext;
} // namespace Service::KernelHelpers
namespace Service::NVFlinger {
constexpr u32 buffer_slots = 0x40;
@@ -54,7 +58,8 @@ public:
NativeWindowFormat = 2,
};
explicit BufferQueue(Kernel::KernelCore& kernel, u32 id_, u64 layer_id_);
explicit BufferQueue(Kernel::KernelCore& kernel, u32 id_, u64 layer_id_,
KernelHelpers::ServiceContext& service_context_);
~BufferQueue();
enum class BufferTransformFlags : u32 {
@@ -130,12 +135,14 @@ private:
std::list<u32> free_buffers;
std::array<Buffer, buffer_slots> buffers;
std::list<u32> queue_sequence;
Kernel::KEvent buffer_wait_event;
Kernel::KEvent* buffer_wait_event{};
std::mutex free_buffers_mutex;
std::condition_variable free_buffers_condition;
std::mutex queue_sequence_mutex;
KernelHelpers::ServiceContext& service_context;
};
} // namespace Service::NVFlinger

@@ -61,12 +61,13 @@ void NVFlinger::SplitVSync() {
}
}
NVFlinger::NVFlinger(Core::System& system_) : system(system_) {
displays.emplace_back(0, "Default", system);
displays.emplace_back(1, "External", system);
displays.emplace_back(2, "Edid", system);
displays.emplace_back(3, "Internal", system);
displays.emplace_back(4, "Null", system);
NVFlinger::NVFlinger(Core::System& system_)
: system(system_), service_context(system_, "nvflinger") {
displays.emplace_back(0, "Default", service_context, system);
displays.emplace_back(1, "External", service_context, system);
displays.emplace_back(2, "Edid", service_context, system);
displays.emplace_back(3, "Internal", service_context, system);
displays.emplace_back(4, "Null", service_context, system);
guard = std::make_shared<std::mutex>();
// Schedule the screen composition events
@@ -146,7 +147,7 @@ std::optional<u64> NVFlinger::CreateLayer(u64 display_id) {
void NVFlinger::CreateLayerAtId(VI::Display& display, u64 layer_id) {
const u32 buffer_queue_id = next_buffer_queue_id++;
buffer_queues.emplace_back(
std::make_unique<BufferQueue>(system.Kernel(), buffer_queue_id, layer_id));
std::make_unique<BufferQueue>(system.Kernel(), buffer_queue_id, layer_id, service_context));
display.CreateLayer(layer_id, *buffer_queues.back());
}

@@ -15,6 +15,7 @@
#include <vector>
#include "common/common_types.h"
#include "core/hle/service/kernel_helpers.h"
namespace Common {
class Event;
@@ -135,6 +136,8 @@ private:
std::unique_ptr<std::thread> vsync_thread;
std::unique_ptr<Common::Event> wait_event;
std::atomic<bool> is_running{};
KernelHelpers::ServiceContext service_context;
};
} // namespace Service::NVFlinger

@@ -46,6 +46,7 @@
#include "core/hle/service/ncm/ncm.h"
#include "core/hle/service/nfc/nfc.h"
#include "core/hle/service/nfp/nfp.h"
#include "core/hle/service/ngct/ngct.h"
#include "core/hle/service/nifm/nifm.h"
#include "core/hle/service/nim/nim.h"
#include "core/hle/service/npns/npns.h"
@@ -271,6 +272,7 @@ Services::Services(std::shared_ptr<SM::ServiceManager>& sm, Core::System& system
NCM::InstallInterfaces(*sm, system);
NFC::InstallInterfaces(*sm, system);
NFP::InstallInterfaces(*sm, system);
NGCT::InstallInterfaces(*sm, system);
NIFM::InstallInterfaces(*sm, system);
NIM::InstallInterfaces(*sm, system);
NPNS::InstallInterfaces(*sm, system);

@@ -12,18 +12,21 @@
#include "core/hle/kernel/k_event.h"
#include "core/hle/kernel/k_readable_event.h"
#include "core/hle/kernel/k_writable_event.h"
#include "core/hle/service/kernel_helpers.h"
#include "core/hle/service/vi/display/vi_display.h"
#include "core/hle/service/vi/layer/vi_layer.h"
namespace Service::VI {
Display::Display(u64 id, std::string name_, Core::System& system)
: display_id{id}, name{std::move(name_)}, vsync_event{system.Kernel()} {
Kernel::KAutoObject::Create(std::addressof(vsync_event));
vsync_event.Initialize(fmt::format("Display VSync Event {}", id));
Display::Display(u64 id, std::string name_, KernelHelpers::ServiceContext& service_context_,
Core::System& system_)
: display_id{id}, name{std::move(name_)}, service_context{service_context_} {
vsync_event = service_context.CreateEvent(fmt::format("Display VSync Event {}", id));
}
Display::~Display() = default;
Display::~Display() {
service_context.CloseEvent(vsync_event);
}
Layer& Display::GetLayer(std::size_t index) {
return *layers.at(index);
@@ -34,11 +37,11 @@ const Layer& Display::GetLayer(std::size_t index) const {
}
Kernel::KReadableEvent& Display::GetVSyncEvent() {
return vsync_event.GetReadableEvent();
return vsync_event->GetReadableEvent();
}
void Display::SignalVSyncEvent() {
vsync_event.GetWritableEvent().Signal();
vsync_event->GetWritableEvent().Signal();
}
void Display::CreateLayer(u64 layer_id, NVFlinger::BufferQueue& buffer_queue) {

@@ -18,6 +18,9 @@ class KEvent;
namespace Service::NVFlinger {
class BufferQueue;
}
namespace Service::KernelHelpers {
class ServiceContext;
} // namespace Service::KernelHelpers
namespace Service::VI {
@@ -31,10 +34,13 @@ class Display {
public:
/// Constructs a display with a given unique ID and name.
///
/// @param id The unique ID for this display.
/// @param id The unique ID for this display.
/// @param service_context_ The ServiceContext for the owning service.
/// @param name_ The name for this display.
/// @param system_ The global system instance.
///
Display(u64 id, std::string name_, Core::System& system);
Display(u64 id, std::string name_, KernelHelpers::ServiceContext& service_context_,
Core::System& system_);
~Display();
/// Gets the unique ID assigned to this display.
@@ -98,9 +104,10 @@ public:
private:
u64 display_id;
std::string name;
KernelHelpers::ServiceContext& service_context;
std::vector<std::shared_ptr<Layer>> layers;
Kernel::KEvent vsync_event;
Kernel::KEvent* vsync_event{};
};
} // namespace Service::VI

@@ -15,6 +15,8 @@
namespace Shader::Backend::SPIRV {
namespace {
constexpr size_t NUM_FIXEDFNCTEXTURE = 10;
enum class Operation {
Increment,
Decrement,
@@ -427,6 +429,16 @@ Id DescType(EmitContext& ctx, Id sampled_type, Id pointer_type, u32 count) {
return pointer_type;
}
}
size_t FindNextUnusedLocation(const std::bitset<IR::NUM_GENERICS>& used_locations,
size_t start_offset) {
for (size_t location = start_offset; location < used_locations.size(); ++location) {
if (!used_locations.test(location)) {
return location;
}
}
throw RuntimeError("Unable to get an unused location for legacy attribute");
}
} // Anonymous namespace
void VectorTypes::Define(Sirit::Module& sirit_ctx, Id base_type, std::string_view name) {
@@ -1227,6 +1239,7 @@ void EmitContext::DefineInputs(const IR::Program& program) {
loads[IR::Attribute::TessellationEvaluationPointV]) {
tess_coord = DefineInput(*this, F32[3], false, spv::BuiltIn::TessCoord);
}
std::bitset<IR::NUM_GENERICS> used_locations{};
for (size_t index = 0; index < IR::NUM_GENERICS; ++index) {
const AttributeType input_type{runtime_info.generic_input_types[index]};
if (!runtime_info.previous_stage_stores.Generic(index)) {
@@ -1238,6 +1251,7 @@ void EmitContext::DefineInputs(const IR::Program& program) {
if (input_type == AttributeType::Disabled) {
continue;
}
used_locations.set(index);
const Id type{GetAttributeType(*this, input_type)};
const Id id{DefineInput(*this, type, true)};
Decorate(id, spv::Decoration::Location, static_cast<u32>(index));
@@ -1263,6 +1277,26 @@ void EmitContext::DefineInputs(const IR::Program& program) {
break;
}
}
size_t previous_unused_location = 0;
if (loads.AnyComponent(IR::Attribute::ColorFrontDiffuseR)) {
const size_t location = FindNextUnusedLocation(used_locations, previous_unused_location);
previous_unused_location = location;
used_locations.set(location);
const Id id{DefineInput(*this, F32[4], true)};
Decorate(id, spv::Decoration::Location, location);
input_front_color = id;
}
for (size_t index = 0; index < NUM_FIXEDFNCTEXTURE; ++index) {
if (loads.AnyComponent(IR::Attribute::FixedFncTexture0S + index * 4)) {
const size_t location =
FindNextUnusedLocation(used_locations, previous_unused_location);
previous_unused_location = location;
used_locations.set(location);
const Id id{DefineInput(*this, F32[4], true)};
Decorate(id, spv::Decoration::Location, location);
input_fixed_fnc_textures[index] = id;
}
}
if (stage == Stage::TessellationEval) {
for (size_t index = 0; index < info.uses_patches.size(); ++index) {
if (!info.uses_patches[index]) {
@@ -1313,9 +1347,31 @@ void EmitContext::DefineOutputs(const IR::Program& program) {
viewport_mask = DefineOutput(*this, TypeArray(U32[1], Const(1u)), std::nullopt,
spv::BuiltIn::ViewportMaskNV);
}
std::bitset<IR::NUM_GENERICS> used_locations{};
for (size_t index = 0; index < IR::NUM_GENERICS; ++index) {
if (info.stores.Generic(index)) {
DefineGenericOutput(*this, index, invocations);
used_locations.set(index);
}
}
size_t previous_unused_location = 0;
if (info.stores.AnyComponent(IR::Attribute::ColorFrontDiffuseR)) {
const size_t location = FindNextUnusedLocation(used_locations, previous_unused_location);
previous_unused_location = location;
used_locations.set(location);
const Id id{DefineOutput(*this, F32[4], invocations)};
Decorate(id, spv::Decoration::Location, static_cast<u32>(location));
output_front_color = id;
}
for (size_t index = 0; index < NUM_FIXEDFNCTEXTURE; ++index) {
if (info.stores.AnyComponent(IR::Attribute::FixedFncTexture0S + index * 4)) {
const size_t location =
FindNextUnusedLocation(used_locations, previous_unused_location);
previous_unused_location = location;
used_locations.set(location);
const Id id{DefineOutput(*this, F32[4], invocations)};
Decorate(id, spv::Decoration::Location, location);
output_fixed_fnc_textures[index] = id;
}
}
switch (stage) {
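Note on the hunks above: `FindNextUnusedLocation` is called with the `used_locations` bitset and the previously returned location, but its definition is not part of this section. A minimal sketch of what such a helper could look like, assuming a linear scan for the first clear bit at or after the start offset. The name `FindNextUnusedLocationSketch`, the constant `kNumGenerics` (standing in for `IR::NUM_GENERICS`) and the exception type are illustrative assumptions, not the project's code.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Hypothetical stand-in for IR::NUM_GENERICS (the generic arrays in this diff hold 32 entries).
constexpr std::size_t kNumGenerics = 32;

// Assumed behaviour: return the first location at or after start_offset whose bit is clear.
// Throws when every remaining location is already taken.
inline std::size_t FindNextUnusedLocationSketch(const std::bitset<kNumGenerics>& used_locations,
                                                std::size_t start_offset) {
    assert(start_offset <= used_locations.size());
    for (std::size_t location = start_offset; location < used_locations.size(); ++location) {
        if (!used_locations.test(location)) {
            return location;
        }
    }
    throw std::runtime_error("no unused location left for legacy attribute");
}
```

Passing the previous result as the start offset, as the callers above do, avoids rescanning locations that are already known to be taken.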

View File

@@ -268,10 +268,14 @@ public:
Id write_global_func_u32x4{};
Id input_position{};
Id input_front_color{};
std::array<Id, 10> input_fixed_fnc_textures{};
std::array<Id, 32> input_generics{};
Id output_point_size{};
Id output_position{};
Id output_front_color{};
std::array<Id, 10> output_fixed_fnc_textures{};
std::array<std::array<GenericElementInfo, 4>, 32> output_generics{};
Id output_tess_level_outer{};

View File

@@ -43,6 +43,25 @@ Id AttrPointer(EmitContext& ctx, Id pointer_type, Id vertex, Id base, Args&&...
}
}
bool IsFixedFncTexture(IR::Attribute attribute) {
return attribute >= IR::Attribute::FixedFncTexture0S &&
attribute <= IR::Attribute::FixedFncTexture9Q;
}
u32 FixedFncTextureAttributeIndex(IR::Attribute attribute) {
if (!IsFixedFncTexture(attribute)) {
throw InvalidArgument("Attribute {} is not a FixedFncTexture", attribute);
}
return (static_cast<u32>(attribute) - static_cast<u32>(IR::Attribute::FixedFncTexture0S)) / 4u;
}
u32 FixedFncTextureAttributeElement(IR::Attribute attribute) {
if (!IsFixedFncTexture(attribute)) {
throw InvalidArgument("Attribute {} is not a FixedFncTexture", attribute);
}
return static_cast<u32>(attribute) % 4u;
}
template <typename... Args>
Id OutputAccessChain(EmitContext& ctx, Id result_type, Id base, Args&&... args) {
if (ctx.stage == Stage::TessellationControl) {
@@ -74,6 +93,13 @@ std::optional<OutAttr> OutputAttrPointer(EmitContext& ctx, IR::Attribute attr) {
return OutputAccessChain(ctx, ctx.output_f32, info.id, index_id);
}
}
if (IsFixedFncTexture(attr)) {
const u32 index{FixedFncTextureAttributeIndex(attr)};
const u32 element{FixedFncTextureAttributeElement(attr)};
const Id element_id{ctx.Const(element)};
return OutputAccessChain(ctx, ctx.output_f32, ctx.output_fixed_fnc_textures[index],
element_id);
}
switch (attr) {
case IR::Attribute::PointSize:
return ctx.output_point_size;
@@ -85,6 +111,14 @@ std::optional<OutAttr> OutputAttrPointer(EmitContext& ctx, IR::Attribute attr) {
const Id element_id{ctx.Const(element)};
return OutputAccessChain(ctx, ctx.output_f32, ctx.output_position, element_id);
}
case IR::Attribute::ColorFrontDiffuseR:
case IR::Attribute::ColorFrontDiffuseG:
case IR::Attribute::ColorFrontDiffuseB:
case IR::Attribute::ColorFrontDiffuseA: {
const u32 element{static_cast<u32>(attr) % 4};
const Id element_id{ctx.Const(element)};
return OutputAccessChain(ctx, ctx.output_f32, ctx.output_front_color, element_id);
}
case IR::Attribute::ClipDistance0:
case IR::Attribute::ClipDistance1:
case IR::Attribute::ClipDistance2:
@@ -307,6 +341,12 @@ Id EmitGetAttribute(EmitContext& ctx, IR::Attribute attr, Id vertex) {
const Id value{ctx.OpLoad(type->id, pointer)};
return type->needs_cast ? ctx.OpBitcast(ctx.F32[1], value) : value;
}
if (IsFixedFncTexture(attr)) {
const u32 index{FixedFncTextureAttributeIndex(attr)};
const Id attr_id{ctx.input_fixed_fnc_textures[index]};
const Id attr_ptr{AttrPointer(ctx, ctx.input_f32, vertex, attr_id, ctx.Const(element))};
return ctx.OpLoad(ctx.F32[1], attr_ptr);
}
switch (attr) {
case IR::Attribute::PrimitiveId:
return ctx.OpBitcast(ctx.F32[1], ctx.OpLoad(ctx.U32[1], ctx.primitive_id));
@@ -316,6 +356,13 @@ Id EmitGetAttribute(EmitContext& ctx, IR::Attribute attr, Id vertex) {
case IR::Attribute::PositionW:
return ctx.OpLoad(ctx.F32[1], AttrPointer(ctx, ctx.input_f32, vertex, ctx.input_position,
ctx.Const(element)));
case IR::Attribute::ColorFrontDiffuseR:
case IR::Attribute::ColorFrontDiffuseG:
case IR::Attribute::ColorFrontDiffuseB:
case IR::Attribute::ColorFrontDiffuseA: {
return ctx.OpLoad(ctx.F32[1], AttrPointer(ctx, ctx.input_f32, vertex, ctx.input_front_color,
ctx.Const(element)));
}
case IR::Attribute::InstanceId:
if (ctx.profile.support_vertex_instance_id) {
return ctx.OpBitcast(ctx.F32[1], ctx.OpLoad(ctx.U32[1], ctx.instance_id));
@@ -333,8 +380,9 @@ Id EmitGetAttribute(EmitContext& ctx, IR::Attribute attr, Id vertex) {
return ctx.OpBitcast(ctx.F32[1], ctx.OpISub(ctx.U32[1], index, base));
}
case IR::Attribute::FrontFace:
return ctx.OpSelect(ctx.U32[1], ctx.OpLoad(ctx.U1, ctx.front_face),
ctx.Const(std::numeric_limits<u32>::max()), ctx.u32_zero_value);
return ctx.OpSelect(ctx.F32[1], ctx.OpLoad(ctx.U1, ctx.front_face),
ctx.OpBitcast(ctx.F32[1], ctx.Const(std::numeric_limits<u32>::max())),
ctx.f32_zero_value);
case IR::Attribute::PointSpriteS:
return ctx.OpLoad(ctx.F32[1],
ctx.OpAccessChain(ctx.input_f32, ctx.point_coord, ctx.u32_zero_value));
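Note on the `FixedFncTexture` helpers in this file: the texture index is computed relative to `FixedFncTexture0S`, while the element is taken as the raw attribute value modulo 4. The two agree only if the base value is a multiple of 4. A minimal sketch of that arithmetic on plain integers follows; the base value `192` is an assumed placeholder for illustration, and all `...Sketch` names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Assumed placeholder for static_cast<u32>(IR::Attribute::FixedFncTexture0S).
constexpr std::uint32_t kFixedFncTexture0S = 192;
// Ten texture slots, matching the std::array<Id, 10> members in this diff.
constexpr std::uint32_t kNumFixedFncTextures = 10;

// The "attribute % 4" element formula is only valid when the base is 4-aligned.
static_assert(kFixedFncTexture0S % 4u == 0, "base must be a multiple of 4");

constexpr bool IsFixedFncTextureSketch(std::uint32_t attribute) {
    return attribute >= kFixedFncTexture0S &&
           attribute < kFixedFncTexture0S + kNumFixedFncTextures * 4u;
}

// Which of the ten vec4 texture coordinates the attribute belongs to.
constexpr std::uint32_t TextureIndexSketch(std::uint32_t attribute) {
    assert(IsFixedFncTextureSketch(attribute));
    return (attribute - kFixedFncTexture0S) / 4u;
}

// Which component (S, T, R, Q as 0..3) inside that vec4.
constexpr std::uint32_t TextureElementSketch(std::uint32_t attribute) {
    assert(IsFixedFncTextureSketch(attribute));
    return attribute % 4u;
}
```

Under these assumptions, attribute `kFixedFncTexture0S + 7` maps to texture 1, component 3.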

View File

@@ -20,6 +20,7 @@
#include "shader_recompiler/frontend/maxwell/decode.h"
#include "shader_recompiler/frontend/maxwell/structured_control_flow.h"
#include "shader_recompiler/frontend/maxwell/translate/translate.h"
#include "shader_recompiler/host_translate_info.h"
#include "shader_recompiler/object_pool.h"
namespace Shader::Maxwell {
@@ -652,7 +653,7 @@ class TranslatePass {
public:
TranslatePass(ObjectPool<IR::Inst>& inst_pool_, ObjectPool<IR::Block>& block_pool_,
ObjectPool<Statement>& stmt_pool_, Environment& env_, Statement& root_stmt,
IR::AbstractSyntaxList& syntax_list_)
IR::AbstractSyntaxList& syntax_list_, const HostTranslateInfo& host_info)
: stmt_pool{stmt_pool_}, inst_pool{inst_pool_}, block_pool{block_pool_}, env{env_},
syntax_list{syntax_list_} {
Visit(root_stmt, nullptr, nullptr);
@@ -660,6 +661,9 @@ public:
IR::Block& first_block{*syntax_list.front().data.block};
IR::IREmitter ir(first_block, first_block.begin());
ir.Prologue();
if (uses_demote_to_helper && host_info.needs_demote_reorder) {
DemoteCombinationPass();
}
}
private:
@@ -809,7 +813,14 @@ private:
}
case StatementType::Return: {
ensure_block();
IR::IREmitter{*current_block}.Epilogue();
IR::Block* return_block{block_pool.Create(inst_pool)};
IR::IREmitter{*return_block}.Epilogue();
current_block->AddBranch(return_block);
auto& merge{syntax_list.emplace_back()};
merge.type = IR::AbstractSyntaxNode::Type::Block;
merge.data.block = return_block;
current_block = nullptr;
syntax_list.emplace_back().type = IR::AbstractSyntaxNode::Type::Return;
break;
@@ -824,6 +835,7 @@ private:
auto& merge{syntax_list.emplace_back()};
merge.type = IR::AbstractSyntaxNode::Type::Block;
merge.data.block = demote_block;
uses_demote_to_helper = true;
break;
}
case StatementType::Unreachable: {
@@ -855,11 +867,117 @@ private:
return block_pool.Create(inst_pool);
}
void DemoteCombinationPass() {
using Type = IR::AbstractSyntaxNode::Type;
std::vector<IR::Block*> demote_blocks;
std::vector<IR::U1> demote_conds;
u32 num_epilogues{};
u32 branch_depth{};
for (const IR::AbstractSyntaxNode& node : syntax_list) {
if (node.type == Type::If) {
++branch_depth;
}
if (node.type == Type::EndIf) {
--branch_depth;
}
if (node.type != Type::Block) {
continue;
}
if (branch_depth > 1) {
// Skip reordering nested demote branches.
continue;
}
for (const IR::Inst& inst : node.data.block->Instructions()) {
const IR::Opcode op{inst.GetOpcode()};
if (op == IR::Opcode::DemoteToHelperInvocation) {
demote_blocks.push_back(node.data.block);
break;
}
if (op == IR::Opcode::Epilogue) {
++num_epilogues;
}
}
}
if (demote_blocks.size() == 0) {
return;
}
if (num_epilogues > 1) {
LOG_DEBUG(Shader, "Combining demotes with more than one return is not implemented.");
return;
}
s64 last_iterator_offset{};
auto& asl{syntax_list};
for (const IR::Block* demote_block : demote_blocks) {
const auto start_it{asl.begin() + last_iterator_offset};
auto asl_it{std::find_if(start_it, asl.end(), [&](const IR::AbstractSyntaxNode& asn) {
return asn.type == Type::If && asn.data.if_node.body == demote_block;
})};
if (asl_it == asl.end()) {
// Demote without a conditional branch.
// No need to proceed since all fragment instances will be demoted regardless.
return;
}
const IR::Block* const end_if = asl_it->data.if_node.merge;
demote_conds.push_back(asl_it->data.if_node.cond);
last_iterator_offset = std::distance(asl.begin(), asl_it);
asl_it = asl.erase(asl_it);
asl_it = std::find_if(asl_it, asl.end(), [&](const IR::AbstractSyntaxNode& asn) {
return asn.type == Type::Block && asn.data.block == demote_block;
});
asl_it = asl.erase(asl_it);
asl_it = std::find_if(asl_it, asl.end(), [&](const IR::AbstractSyntaxNode& asn) {
return asn.type == Type::EndIf && asn.data.end_if.merge == end_if;
});
asl_it = asl.erase(asl_it);
}
const auto epilogue_func{[](const IR::AbstractSyntaxNode& asn) {
if (asn.type != Type::Block) {
return false;
}
for (const auto& inst : asn.data.block->Instructions()) {
if (inst.GetOpcode() == IR::Opcode::Epilogue) {
return true;
}
}
return false;
}};
const auto reverse_it{std::find_if(asl.rbegin(), asl.rend(), epilogue_func)};
const auto return_block_it{(reverse_it + 1).base()};
IR::IREmitter ir{*(return_block_it - 1)->data.block};
IR::U1 cond(IR::Value(false));
for (const auto& demote_cond : demote_conds) {
cond = ir.LogicalOr(cond, demote_cond);
}
cond.Inst()->DestructiveAddUsage(1);
IR::AbstractSyntaxNode demote_if_node{};
demote_if_node.type = Type::If;
demote_if_node.data.if_node.cond = cond;
demote_if_node.data.if_node.body = demote_blocks[0];
demote_if_node.data.if_node.merge = return_block_it->data.block;
IR::AbstractSyntaxNode demote_node{};
demote_node.type = Type::Block;
demote_node.data.block = demote_blocks[0];
IR::AbstractSyntaxNode demote_endif_node{};
demote_endif_node.type = Type::EndIf;
demote_endif_node.data.end_if.merge = return_block_it->data.block;
asl.insert(return_block_it, demote_endif_node);
asl.insert(return_block_it, demote_node);
asl.insert(return_block_it, demote_if_node);
}
ObjectPool<Statement>& stmt_pool;
ObjectPool<IR::Inst>& inst_pool;
ObjectPool<IR::Block>& block_pool;
Environment& env;
IR::AbstractSyntaxList& syntax_list;
bool uses_demote_to_helper{};
// TODO: C++20 Remove this when all compilers support constexpr std::vector
#if __cpp_lib_constexpr_vector >= 201907
@@ -871,12 +989,13 @@ private:
} // Anonymous namespace
IR::AbstractSyntaxList BuildASL(ObjectPool<IR::Inst>& inst_pool, ObjectPool<IR::Block>& block_pool,
Environment& env, Flow::CFG& cfg) {
Environment& env, Flow::CFG& cfg,
const HostTranslateInfo& host_info) {
ObjectPool<Statement> stmt_pool{64};
GotoPass goto_pass{cfg, stmt_pool};
Statement& root{goto_pass.RootStatement()};
IR::AbstractSyntaxList syntax_list;
TranslatePass{inst_pool, block_pool, stmt_pool, env, root, syntax_list};
TranslatePass{inst_pool, block_pool, stmt_pool, env, root, syntax_list, host_info};
return syntax_list;
}

View File

@@ -11,10 +11,13 @@
#include "shader_recompiler/frontend/maxwell/control_flow.h"
#include "shader_recompiler/object_pool.h"
namespace Shader::Maxwell {
namespace Shader {
struct HostTranslateInfo;
namespace Maxwell {
[[nodiscard]] IR::AbstractSyntaxList BuildASL(ObjectPool<IR::Inst>& inst_pool,
ObjectPool<IR::Block>& block_pool, Environment& env,
Flow::CFG& cfg);
Flow::CFG& cfg, const HostTranslateInfo& host_info);
} // namespace Shader::Maxwell
} // namespace Maxwell
} // namespace Shader

View File

@@ -130,7 +130,7 @@ void AddNVNStorageBuffers(IR::Program& program) {
IR::Program TranslateProgram(ObjectPool<IR::Inst>& inst_pool, ObjectPool<IR::Block>& block_pool,
Environment& env, Flow::CFG& cfg, const HostTranslateInfo& host_info) {
IR::Program program;
program.syntax_list = BuildASL(inst_pool, block_pool, env, cfg);
program.syntax_list = BuildASL(inst_pool, block_pool, env, cfg, host_info);
program.blocks = GenerateBlocks(program.syntax_list);
program.post_order_blocks = PostOrder(program.syntax_list.front());
program.stage = env.ShaderStage();

View File

@@ -11,8 +11,9 @@ namespace Shader {
/// Misc information about the host
struct HostTranslateInfo {
bool support_float16{}; ///< True when the device supports 16-bit floats
bool support_int64{}; ///< True when the device supports 64-bit integers
bool support_float16{}; ///< True when the device supports 16-bit floats
bool support_int64{}; ///< True when the device supports 64-bit integers
bool needs_demote_reorder{}; ///< True when the device needs DemoteToHelperInvocation reordered
};
} // namespace Shader

View File

@@ -261,16 +261,6 @@ public:
stream_score += score;
}
/// Sets the new frame tick
void SetFrameTick(u64 new_frame_tick) noexcept {
frame_tick = new_frame_tick;
}
/// Returns the new frame tick
[[nodiscard]] u64 FrameTick() const noexcept {
return frame_tick;
}
/// Returns the likeliness of this being a stream buffer
[[nodiscard]] int StreamScore() const noexcept {
return stream_score;
@@ -307,6 +297,14 @@ public:
return words.size_bytes;
}
size_t getLRUID() const noexcept {
return lru_id;
}
void setLRUID(size_t lru_id_) {
lru_id = lru_id_;
}
private:
template <Type type>
u64* Array() noexcept {
@@ -603,9 +601,9 @@ private:
RasterizerInterface* rasterizer = nullptr;
VAddr cpu_addr = 0;
Words words;
u64 frame_tick = 0;
BufferFlagBits flags{};
int stream_score = 0;
size_t lru_id = SIZE_MAX;
};
} // namespace VideoCommon

View File

@@ -20,6 +20,7 @@
#include "common/common_types.h"
#include "common/div_ceil.h"
#include "common/literals.h"
#include "common/lru_cache.h"
#include "common/microprofile.h"
#include "common/scope_exit.h"
#include "common/settings.h"
@@ -330,7 +331,7 @@ private:
template <bool insert>
void ChangeRegister(BufferId buffer_id);
void TouchBuffer(Buffer& buffer) const noexcept;
void TouchBuffer(Buffer& buffer, BufferId buffer_id) noexcept;
bool SynchronizeBuffer(Buffer& buffer, VAddr cpu_addr, u32 size);
@@ -428,7 +429,11 @@ private:
size_t immediate_buffer_capacity = 0;
std::unique_ptr<u8[]> immediate_buffer_alloc;
typename SlotVector<Buffer>::Iterator deletion_iterator;
struct LRUItemParams {
using ObjectType = BufferId;
using TickType = u64;
};
Common::LeastRecentlyUsedCache<LRUItemParams> lru_cache;
u64 frame_tick = 0;
u64 total_used_memory = 0;
@@ -445,7 +450,6 @@ BufferCache<P>::BufferCache(VideoCore::RasterizerInterface& rasterizer_,
kepler_compute{kepler_compute_}, gpu_memory{gpu_memory_}, cpu_memory{cpu_memory_} {
// Ensure the first slot is used for the null buffer
void(slot_buffers.insert(runtime, NullBufferParams{}));
deletion_iterator = slot_buffers.end();
common_ranges.clear();
}
@@ -454,20 +458,17 @@ void BufferCache<P>::RunGarbageCollector() {
const bool aggressive_gc = total_used_memory >= CRITICAL_MEMORY;
const u64 ticks_to_destroy = aggressive_gc ? 60 : 120;
int num_iterations = aggressive_gc ? 64 : 32;
for (; num_iterations > 0; --num_iterations) {
if (deletion_iterator == slot_buffers.end()) {
deletion_iterator = slot_buffers.begin();
const auto clean_up = [this, &num_iterations](BufferId buffer_id) {
if (num_iterations == 0) {
return true;
}
++deletion_iterator;
if (deletion_iterator == slot_buffers.end()) {
break;
}
const auto [buffer_id, buffer] = *deletion_iterator;
if (buffer->FrameTick() + ticks_to_destroy < frame_tick) {
DownloadBufferMemory(*buffer);
DeleteBuffer(buffer_id);
}
}
--num_iterations;
auto& buffer = slot_buffers[buffer_id];
DownloadBufferMemory(buffer);
DeleteBuffer(buffer_id);
return false;
};
lru_cache.ForEachItemBelow(frame_tick - ticks_to_destroy, clean_up);
}
template <class P>
@@ -485,7 +486,7 @@ void BufferCache<P>::TickFrame() {
const bool skip_preferred = hits * 256 < shots * 251;
uniform_buffer_skip_cache_size = skip_preferred ? DEFAULT_SKIP_CACHE_SIZE : 0;
if (Settings::values.use_caches_gc.GetValue() && total_used_memory >= EXPECTED_MEMORY) {
if (total_used_memory >= EXPECTED_MEMORY) {
RunGarbageCollector();
}
++frame_tick;
@@ -954,7 +955,7 @@ bool BufferCache<P>::IsRegionCpuModified(VAddr addr, size_t size) {
template <class P>
void BufferCache<P>::BindHostIndexBuffer() {
Buffer& buffer = slot_buffers[index_buffer.buffer_id];
TouchBuffer(buffer);
TouchBuffer(buffer, index_buffer.buffer_id);
const u32 offset = buffer.Offset(index_buffer.cpu_addr);
const u32 size = index_buffer.size;
SynchronizeBuffer(buffer, index_buffer.cpu_addr, size);
@@ -975,7 +976,7 @@ void BufferCache<P>::BindHostVertexBuffers() {
for (u32 index = 0; index < NUM_VERTEX_BUFFERS; ++index) {
const Binding& binding = vertex_buffers[index];
Buffer& buffer = slot_buffers[binding.buffer_id];
TouchBuffer(buffer);
TouchBuffer(buffer, binding.buffer_id);
SynchronizeBuffer(buffer, binding.cpu_addr, binding.size);
if (!flags[Dirty::VertexBuffer0 + index]) {
continue;
@@ -1011,7 +1012,7 @@ void BufferCache<P>::BindHostGraphicsUniformBuffer(size_t stage, u32 index, u32
const VAddr cpu_addr = binding.cpu_addr;
const u32 size = std::min(binding.size, (*uniform_buffer_sizes)[stage][index]);
Buffer& buffer = slot_buffers[binding.buffer_id];
TouchBuffer(buffer);
TouchBuffer(buffer, binding.buffer_id);
const bool use_fast_buffer = binding.buffer_id != NULL_BUFFER_ID &&
size <= uniform_buffer_skip_cache_size &&
!buffer.IsRegionGpuModified(cpu_addr, size);
@@ -1083,7 +1084,7 @@ void BufferCache<P>::BindHostGraphicsStorageBuffers(size_t stage) {
ForEachEnabledBit(enabled_storage_buffers[stage], [&](u32 index) {
const Binding& binding = storage_buffers[stage][index];
Buffer& buffer = slot_buffers[binding.buffer_id];
TouchBuffer(buffer);
TouchBuffer(buffer, binding.buffer_id);
const u32 size = binding.size;
SynchronizeBuffer(buffer, binding.cpu_addr, size);
@@ -1128,7 +1129,7 @@ void BufferCache<P>::BindHostTransformFeedbackBuffers() {
for (u32 index = 0; index < NUM_TRANSFORM_FEEDBACK_BUFFERS; ++index) {
const Binding& binding = transform_feedback_buffers[index];
Buffer& buffer = slot_buffers[binding.buffer_id];
TouchBuffer(buffer);
TouchBuffer(buffer, binding.buffer_id);
const u32 size = binding.size;
SynchronizeBuffer(buffer, binding.cpu_addr, size);
@@ -1148,7 +1149,7 @@ void BufferCache<P>::BindHostComputeUniformBuffers() {
ForEachEnabledBit(enabled_compute_uniform_buffer_mask, [&](u32 index) {
const Binding& binding = compute_uniform_buffers[index];
Buffer& buffer = slot_buffers[binding.buffer_id];
TouchBuffer(buffer);
TouchBuffer(buffer, binding.buffer_id);
const u32 size = std::min(binding.size, (*compute_uniform_buffer_sizes)[index]);
SynchronizeBuffer(buffer, binding.cpu_addr, size);
@@ -1168,7 +1169,7 @@ void BufferCache<P>::BindHostComputeStorageBuffers() {
ForEachEnabledBit(enabled_compute_storage_buffers, [&](u32 index) {
const Binding& binding = compute_storage_buffers[index];
Buffer& buffer = slot_buffers[binding.buffer_id];
TouchBuffer(buffer);
TouchBuffer(buffer, binding.buffer_id);
const u32 size = binding.size;
SynchronizeBuffer(buffer, binding.cpu_addr, size);
@@ -1513,11 +1514,11 @@ BufferId BufferCache<P>::CreateBuffer(VAddr cpu_addr, u32 wanted_size) {
const OverlapResult overlap = ResolveOverlaps(cpu_addr, wanted_size);
const u32 size = static_cast<u32>(overlap.end - overlap.begin);
const BufferId new_buffer_id = slot_buffers.insert(runtime, rasterizer, overlap.begin, size);
TouchBuffer(slot_buffers[new_buffer_id]);
for (const BufferId overlap_id : overlap.ids) {
JoinOverlap(new_buffer_id, overlap_id, !overlap.has_stream_leap);
}
Register(new_buffer_id);
TouchBuffer(slot_buffers[new_buffer_id], new_buffer_id);
return new_buffer_id;
}
@@ -1534,12 +1535,14 @@ void BufferCache<P>::Unregister(BufferId buffer_id) {
template <class P>
template <bool insert>
void BufferCache<P>::ChangeRegister(BufferId buffer_id) {
const Buffer& buffer = slot_buffers[buffer_id];
Buffer& buffer = slot_buffers[buffer_id];
const auto size = buffer.SizeBytes();
if (insert) {
total_used_memory += Common::AlignUp(size, 1024);
buffer.setLRUID(lru_cache.Insert(buffer_id, frame_tick));
} else {
total_used_memory -= Common::AlignUp(size, 1024);
lru_cache.Free(buffer.getLRUID());
}
const VAddr cpu_addr_begin = buffer.CpuAddr();
const VAddr cpu_addr_end = cpu_addr_begin + size;
@@ -1555,8 +1558,10 @@ void BufferCache<P>::ChangeRegister(BufferId buffer_id) {
}
template <class P>
void BufferCache<P>::TouchBuffer(Buffer& buffer) const noexcept {
buffer.SetFrameTick(frame_tick);
void BufferCache<P>::TouchBuffer(Buffer& buffer, BufferId buffer_id) noexcept {
if (buffer_id != NULL_BUFFER_ID) {
lru_cache.Touch(buffer.getLRUID(), frame_tick);
}
}
template <class P>
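Note on the buffer cache change: the garbage collector no longer walks `slot_buffers` with a round-robin iterator and instead asks `lru_cache` for entries older than a tick threshold. `common/lru_cache.h` itself is not shown in this section, so the following is only a simplified sketch of a container with the four calls the diff relies on (`Insert`, `Touch`, `Free`, `ForEachItemBelow`). It assumes ticks never decrease, treats "below" as strictly less than the threshold, and never reuses freed ids; the real class may differ on each of these points.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <iterator>
#include <list>
#include <vector>

// Hypothetical, simplified stand-in for Common::LeastRecentlyUsedCache.
template <typename Object>
class LruSketch {
public:
    // Registers an object and returns the id used for later Touch/Free calls.
    std::size_t Insert(Object obj, std::uint64_t tick) {
        order.push_back(Item{obj, tick});
        slots.push_back(std::prev(order.end()));
        return slots.size() - 1;
    }

    // Marks the entry as used at `tick` and moves it to the most-recent end.
    void Touch(std::size_t id, std::uint64_t tick) {
        assert(slots[id] != order.end());
        slots[id]->tick = tick;
        order.splice(order.end(), order, slots[id]); // list iterators stay valid
    }

    // Removes the entry; the id must not be used afterwards.
    void Free(std::size_t id) {
        assert(slots[id] != order.end());
        order.erase(slots[id]);
        slots[id] = order.end();
    }

    // Visits entries from oldest to newest while their tick is below `tick`.
    // The callback returns true to stop early, mirroring the clean_up lambdas above.
    template <typename Func>
    void ForEachItemBelow(std::uint64_t tick, Func&& func) {
        auto it = order.begin();
        while (it != order.end() && it->tick < tick) {
            const auto next = std::next(it); // callback may free the current entry
            if (func(it->obj)) {
                return;
            }
            it = next;
        }
    }

private:
    struct Item {
        Object obj;
        std::uint64_t tick;
    };
    std::list<Item> order; // oldest first, under the non-decreasing tick assumption
    std::vector<typename std::list<Item>::iterator> slots;
};
```

Because the list is kept in use order, the visit stops at the first entry that is recent enough, so the cost of a collection pass scales with the number of stale entries rather than with the total number of buffers.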

View File

@@ -742,6 +742,7 @@ VpxBitStreamWriter VP9::ComposeUncompressedHeader() {
uncomp_writer.WriteDeltaQ(current_frame_info.uv_dc_delta_q);
uncomp_writer.WriteDeltaQ(current_frame_info.uv_ac_delta_q);
ASSERT(!current_frame_info.segment_enabled);
uncomp_writer.WriteBit(false); // Segmentation enabled (TODO).
const s32 min_tile_cols_log2 = CalcMinLog2TileCols(current_frame_info.frame_size.width);

View File

@@ -22,7 +22,7 @@ struct Vp9FrameDimensions {
};
static_assert(sizeof(Vp9FrameDimensions) == 0x8, "Vp9 Vp9FrameDimensions is an invalid size");
enum FrameFlags : u32 {
enum class FrameFlags : u32 {
IsKeyFrame = 1 << 0,
LastFrameIsKeyFrame = 1 << 1,
FrameSizeChanged = 1 << 2,
@@ -30,6 +30,7 @@ enum FrameFlags : u32 {
LastShowFrame = 1 << 4,
IntraOnly = 1 << 5,
};
DECLARE_ENUM_FLAG_OPERATORS(FrameFlags)
enum class TxSize {
Tx4x4 = 0, // 4x4 transform
@@ -92,44 +93,34 @@ struct Vp9EntropyProbs {
static_assert(sizeof(Vp9EntropyProbs) == 0x7B4, "Vp9EntropyProbs is an invalid size");
struct Vp9PictureInfo {
bool is_key_frame;
bool intra_only;
bool last_frame_was_key;
bool frame_size_changed;
bool error_resilient_mode;
bool last_frame_shown;
bool show_frame;
u32 bitstream_size;
std::array<u64, 4> frame_offsets;
std::array<s8, 4> ref_frame_sign_bias;
s32 base_q_index;
s32 y_dc_delta_q;
s32 uv_dc_delta_q;
s32 uv_ac_delta_q;
bool lossless;
s32 transform_mode;
bool allow_high_precision_mv;
s32 interp_filter;
s32 reference_mode;
s8 comp_fixed_ref;
std::array<s8, 2> comp_var_ref;
s32 log2_tile_cols;
s32 log2_tile_rows;
bool segment_enabled;
bool segment_map_update;
bool segment_map_temporal_update;
s32 segment_abs_delta;
std::array<u32, 8> segment_feature_enable;
std::array<std::array<s16, 4>, 8> segment_feature_data;
bool mode_ref_delta_enabled;
bool use_prev_in_find_mv_refs;
std::array<s8, 4> ref_deltas;
std::array<s8, 2> mode_deltas;
Vp9EntropyProbs entropy;
Vp9FrameDimensions frame_size;
u8 first_level;
u8 sharpness_level;
u32 bitstream_size;
std::array<u64, 4> frame_offsets;
std::array<bool, 4> refresh_frame;
bool is_key_frame;
bool intra_only;
bool last_frame_was_key;
bool error_resilient_mode;
bool last_frame_shown;
bool show_frame;
bool lossless;
bool allow_high_precision_mv;
bool segment_enabled;
bool mode_ref_delta_enabled;
};
struct Vp9FrameContainer {
@@ -145,7 +136,7 @@ struct PictureInfo {
Vp9FrameDimensions golden_frame_size; ///< 0x50
Vp9FrameDimensions alt_frame_size; ///< 0x58
Vp9FrameDimensions current_frame_size; ///< 0x60
u32 vp9_flags; ///< 0x68
FrameFlags vp9_flags; ///< 0x68
std::array<s8, 4> ref_frame_sign_bias; ///< 0x6C
u8 first_level; ///< 0x70
u8 sharpness_level; ///< 0x71
@@ -158,60 +149,43 @@ struct PictureInfo {
u8 allow_high_precision_mv; ///< 0x78
u8 interp_filter; ///< 0x79
u8 reference_mode; ///< 0x7A
s8 comp_fixed_ref; ///< 0x7B
std::array<s8, 2> comp_var_ref; ///< 0x7C
INSERT_PADDING_BYTES_NOINIT(3); ///< 0x7B
u8 log2_tile_cols; ///< 0x7E
u8 log2_tile_rows; ///< 0x7F
Segmentation segmentation; ///< 0x80
LoopFilter loop_filter; ///< 0xE4
INSERT_PADDING_BYTES_NOINIT(5); ///< 0xEB
u32 surface_params; ///< 0xF0
INSERT_PADDING_WORDS_NOINIT(3); ///< 0xF4
INSERT_PADDING_BYTES_NOINIT(21); ///< 0xEB
[[nodiscard]] Vp9PictureInfo Convert() const {
return {
.is_key_frame = (vp9_flags & FrameFlags::IsKeyFrame) != 0,
.intra_only = (vp9_flags & FrameFlags::IntraOnly) != 0,
.last_frame_was_key = (vp9_flags & FrameFlags::LastFrameIsKeyFrame) != 0,
.frame_size_changed = (vp9_flags & FrameFlags::FrameSizeChanged) != 0,
.error_resilient_mode = (vp9_flags & FrameFlags::ErrorResilientMode) != 0,
.last_frame_shown = (vp9_flags & FrameFlags::LastShowFrame) != 0,
.show_frame = true,
.bitstream_size = bitstream_size,
.frame_offsets{},
.ref_frame_sign_bias = ref_frame_sign_bias,
.base_q_index = base_q_index,
.y_dc_delta_q = y_dc_delta_q,
.uv_dc_delta_q = uv_dc_delta_q,
.uv_ac_delta_q = uv_ac_delta_q,
.lossless = lossless != 0,
.transform_mode = tx_mode,
.allow_high_precision_mv = allow_high_precision_mv != 0,
.interp_filter = interp_filter,
.reference_mode = reference_mode,
.comp_fixed_ref = comp_fixed_ref,
.comp_var_ref = comp_var_ref,
.log2_tile_cols = log2_tile_cols,
.log2_tile_rows = log2_tile_rows,
.segment_enabled = segmentation.enabled != 0,
.segment_map_update = segmentation.update_map != 0,
.segment_map_temporal_update = segmentation.temporal_update != 0,
.segment_abs_delta = segmentation.abs_delta,
.segment_feature_enable = segmentation.feature_mask,
.segment_feature_data = segmentation.feature_data,
.mode_ref_delta_enabled = loop_filter.mode_ref_delta_enabled != 0,
.use_prev_in_find_mv_refs = !(vp9_flags == (FrameFlags::ErrorResilientMode)) &&
!(vp9_flags == (FrameFlags::FrameSizeChanged)) &&
!(vp9_flags == (FrameFlags::IntraOnly)) &&
(vp9_flags == (FrameFlags::LastShowFrame)) &&
!(vp9_flags == (FrameFlags::LastFrameIsKeyFrame)),
.ref_deltas = loop_filter.ref_deltas,
.mode_deltas = loop_filter.mode_deltas,
.entropy{},
.frame_size = current_frame_size,
.first_level = first_level,
.sharpness_level = sharpness_level,
.bitstream_size = bitstream_size,
.frame_offsets{},
.refresh_frame{},
.is_key_frame = True(vp9_flags & FrameFlags::IsKeyFrame),
.intra_only = True(vp9_flags & FrameFlags::IntraOnly),
.last_frame_was_key = True(vp9_flags & FrameFlags::LastFrameIsKeyFrame),
.error_resilient_mode = True(vp9_flags & FrameFlags::ErrorResilientMode),
.last_frame_shown = True(vp9_flags & FrameFlags::LastShowFrame),
.show_frame = true,
.lossless = lossless != 0,
.allow_high_precision_mv = allow_high_precision_mv != 0,
.segment_enabled = segmentation.enabled != 0,
.mode_ref_delta_enabled = loop_filter.mode_ref_delta_enabled != 0,
};
}
};
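Note on the `FrameFlags` change: turning the enum into an `enum class` removes the implicit conversion to `u32`, which is why the diff adds `DECLARE_ENUM_FLAG_OPERATORS(FrameFlags)` and replaces `!= 0` with `True(...)`. The macro and `True` are not defined in this section; a rough sketch of what they plausibly provide, using hypothetical `...Sketch` names and only a subset of the flag values shown above:

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Subset of the flags from the diff, as a scoped enum.
enum class FrameFlagsSketch : std::uint32_t {
    IsKeyFrame = 1u << 0,
    LastShowFrame = 1u << 4,
    IntraOnly = 1u << 5,
};

// Hand-written equivalents of what a flag-operator macro would typically generate.
constexpr FrameFlagsSketch operator|(FrameFlagsSketch a, FrameFlagsSketch b) {
    using U = std::underlying_type_t<FrameFlagsSketch>;
    return static_cast<FrameFlagsSketch>(static_cast<U>(a) | static_cast<U>(b));
}

constexpr FrameFlagsSketch operator&(FrameFlagsSketch a, FrameFlagsSketch b) {
    using U = std::underlying_type_t<FrameFlagsSketch>;
    return static_cast<FrameFlagsSketch>(static_cast<U>(a) & static_cast<U>(b));
}

// Assumed meaning of True(): any bit set in the masked result.
constexpr bool TrueSketch(FrameFlagsSketch flags) {
    return static_cast<std::underlying_type_t<FrameFlagsSketch>>(flags) != 0;
}

// Usage in the style of Convert() above.
inline void CheckFrameFlagsSketch() {
    constexpr FrameFlagsSketch flags = FrameFlagsSketch::IsKeyFrame | FrameFlagsSketch::IntraOnly;
    assert(TrueSketch(flags & FrameFlagsSketch::IsKeyFrame));
    assert(!TrueSketch(flags & FrameFlagsSketch::LastShowFrame));
}
```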
@@ -316,7 +290,6 @@ ASSERT_POSITION(last_frame_size, 0x48);
ASSERT_POSITION(first_level, 0x70);
ASSERT_POSITION(segmentation, 0x80);
ASSERT_POSITION(loop_filter, 0xE4);
ASSERT_POSITION(surface_params, 0xF0);
#undef ASSERT_POSITION
#define ASSERT_POSITION(field_name, position) \

View File

@@ -475,10 +475,10 @@ public:
// These values are used by Nouveau and some games.
AddGL = 0x8006,
SubtractGL = 0x8007,
ReverseSubtractGL = 0x8008,
MinGL = 0x800a,
MaxGL = 0x800b
MinGL = 0x8007,
MaxGL = 0x8008,
SubtractGL = 0x800a,
ReverseSubtractGL = 0x800b
};
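Note on the blend equation fix: the corrected `*GL` values line up with the OpenGL blend equation tokens (`GL_FUNC_ADD`, `GL_MIN`, `GL_MAX`, `GL_FUNC_SUBTRACT`, `GL_FUNC_REVERSE_SUBTRACT`), whereas the old values had subtract and reverse-subtract in the slots that belong to min and max. A small self-contained check, with the token values written out by hand and `EquationSketch` as a hypothetical copy of the corrected enum entries:

```cpp
#include <cassert>
#include <cstdint>

// OpenGL blend equation token values, spelled out so no GL header is needed.
constexpr std::uint32_t kGlFuncAdd = 0x8006;
constexpr std::uint32_t kGlMin = 0x8007;
constexpr std::uint32_t kGlMax = 0x8008;
constexpr std::uint32_t kGlFuncSubtract = 0x800A;
constexpr std::uint32_t kGlFuncReverseSubtract = 0x800B;

// Hypothetical copy of the corrected entries from the diff.
enum class EquationSketch : std::uint32_t {
    AddGL = 0x8006,
    MinGL = 0x8007,
    MaxGL = 0x8008,
    SubtractGL = 0x800a,
    ReverseSubtractGL = 0x800b,
};

constexpr bool MatchesGlToken(EquationSketch equation, std::uint32_t gl_token) {
    return static_cast<std::uint32_t>(equation) == gl_token;
}

// Every corrected entry must equal its OpenGL counterpart.
inline void CheckEquationValues() {
    assert(MatchesGlToken(EquationSketch::AddGL, kGlFuncAdd));
    assert(MatchesGlToken(EquationSketch::MinGL, kGlMin));
    assert(MatchesGlToken(EquationSketch::MaxGL, kGlMax));
    assert(MatchesGlToken(EquationSketch::SubtractGL, kGlFuncSubtract));
    assert(MatchesGlToken(EquationSketch::ReverseSubtractGL, kGlFuncReverseSubtract));
}
```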
enum class Factor : u32 {

View File

@@ -156,6 +156,10 @@ public:
return shader_backend;
}
bool IsAmd() const {
return vendor_name == "ATI Technologies Inc.";
}
private:
static bool TestVariableAoffi();
static bool TestPreciseBug();

View File

@@ -219,6 +219,7 @@ ShaderCache::ShaderCache(RasterizerOpenGL& rasterizer_, Core::Frontend::EmuWindo
host_info{
.support_float16 = false,
.support_int64 = device.HasShaderInt64(),
.needs_demote_reorder = device.IsAmd(),
} {
if (use_asynchronous_shaders) {
workers = CreateWorkers();

View File

@@ -159,11 +159,13 @@ VkSemaphore VKBlitScreen::Draw(const Tegra::FramebufferConfig& framebuffer,
const VAddr framebuffer_addr = framebuffer.address + framebuffer.offset;
const u8* const host_ptr = cpu_memory.GetPointer(framebuffer_addr);
const size_t size_bytes = GetSizeInBytes(framebuffer);
// TODO(Rodrigo): Read this from HLE
constexpr u32 block_height_log2 = 4;
const u32 bytes_per_pixel = GetBytesPerPixel(framebuffer);
const u64 size_bytes{Tegra::Texture::CalculateSize(true, bytes_per_pixel,
framebuffer.stride, framebuffer.height,
1, block_height_log2, 0)};
Tegra::Texture::UnswizzleTexture(
mapped_span.subspan(image_offset, size_bytes), std::span(host_ptr, size_bytes),
bytes_per_pixel, framebuffer.width, framebuffer.height, 1, block_height_log2, 0);

View File

@@ -325,6 +325,8 @@ PipelineCache::PipelineCache(RasterizerVulkan& rasterizer_, Tegra::Engines::Maxw
host_info = Shader::HostTranslateInfo{
.support_float16 = device.IsFloat16Supported(),
.support_int64 = device.IsShaderInt64Supported(),
.needs_demote_reorder = driver_id == VK_DRIVER_ID_AMD_PROPRIETARY_KHR ||
driver_id == VK_DRIVER_ID_AMD_OPEN_SOURCE_KHR,
};
}

View File

@@ -80,7 +80,7 @@ struct ImageBase {
VAddr cpu_addr_end = 0;
u64 modification_tick = 0;
u64 frame_tick = 0;
size_t lru_index = SIZE_MAX;
std::array<u32, MAX_MIP_LEVELS> mip_level_offsets{};

View File

@@ -5,7 +5,6 @@
#pragma once
#include "common/alignment.h"
#include "common/settings.h"
#include "video_core/dirty_flags.h"
#include "video_core/texture_cache/samples_helper.h"
#include "video_core/texture_cache/texture_cache_base.h"
@@ -43,8 +42,6 @@ TextureCache<P>::TextureCache(Runtime& runtime_, VideoCore::RasterizerInterface&
void(slot_image_views.insert(runtime, NullImageParams{}));
void(slot_samplers.insert(runtime, sampler_descriptor));
deletion_iterator = slot_images.begin();
if constexpr (HAS_DEVICE_MEMORY_INFO) {
const auto device_memory = runtime.GetDeviceLocalMemory();
const u64 possible_expected_memory = (device_memory * 3) / 10;
@@ -64,70 +61,38 @@ template <class P>
void TextureCache<P>::RunGarbageCollector() {
const bool high_priority_mode = total_used_memory >= expected_memory;
const bool aggressive_mode = total_used_memory >= critical_memory;
const u64 ticks_to_destroy = high_priority_mode ? 60 : 100;
int num_iterations = aggressive_mode ? 256 : (high_priority_mode ? 128 : 64);
for (; num_iterations > 0; --num_iterations) {
if (deletion_iterator == slot_images.end()) {
deletion_iterator = slot_images.begin();
if (deletion_iterator == slot_images.end()) {
break;
}
const u64 ticks_to_destroy = aggressive_mode ? 10ULL : high_priority_mode ? 25ULL : 100ULL;
size_t num_iterations = aggressive_mode ? 10000 : (high_priority_mode ? 100 : 5);
const auto clean_up = [this, &num_iterations, high_priority_mode](ImageId image_id) {
if (num_iterations == 0) {
return true;
}
auto [image_id, image_tmp] = *deletion_iterator;
Image* image = image_tmp; // fix clang error.
const bool is_alias = True(image->flags & ImageFlagBits::Alias);
const bool is_bad_overlap = True(image->flags & ImageFlagBits::BadOverlap);
const bool must_download = image->IsSafeDownload();
bool should_care = is_bad_overlap || is_alias || (high_priority_mode && !must_download);
const u64 ticks_needed =
is_bad_overlap
? ticks_to_destroy >> 4
: ((should_care && aggressive_mode) ? ticks_to_destroy >> 1 : ticks_to_destroy);
should_care |= aggressive_mode;
if (should_care && image->frame_tick + ticks_needed < frame_tick) {
if (is_bad_overlap) {
const bool overlap_check = std::ranges::all_of(
image->overlapping_images, [&, image](const ImageId& overlap_id) {
auto& overlap = slot_images[overlap_id];
return overlap.frame_tick >= image->frame_tick;
});
if (!overlap_check) {
++deletion_iterator;
continue;
}
}
if (!is_bad_overlap && must_download) {
const bool alias_check = std::ranges::none_of(
image->aliased_images, [&, image](const AliasedImage& alias) {
auto& alias_image = slot_images[alias.id];
return (alias_image.frame_tick < image->frame_tick) ||
(alias_image.modification_tick < image->modification_tick);
});
if (alias_check) {
auto map = runtime.DownloadStagingBuffer(image->unswizzled_size_bytes);
const auto copies = FullDownloadCopies(image->info);
image->DownloadMemory(map, copies);
runtime.Finish();
SwizzleImage(gpu_memory, image->gpu_addr, image->info, copies, map.mapped_span);
}
}
if (True(image->flags & ImageFlagBits::Tracked)) {
UntrackImage(*image, image_id);
}
UnregisterImage(image_id);
DeleteImage(image_id);
if (is_bad_overlap) {
++num_iterations;
}
--num_iterations;
auto& image = slot_images[image_id];
const bool must_download = image.IsSafeDownload();
if (!high_priority_mode && must_download) {
return false;
}
++deletion_iterator;
}
if (must_download) {
auto map = runtime.DownloadStagingBuffer(image.unswizzled_size_bytes);
const auto copies = FullDownloadCopies(image.info);
image.DownloadMemory(map, copies);
runtime.Finish();
SwizzleImage(gpu_memory, image.gpu_addr, image.info, copies, map.mapped_span);
}
if (True(image.flags & ImageFlagBits::Tracked)) {
UntrackImage(image, image_id);
}
UnregisterImage(image_id);
DeleteImage(image_id);
return false;
};
lru_cache.ForEachItemBelow(frame_tick - ticks_to_destroy, clean_up);
}
template <class P>
void TextureCache<P>::TickFrame() {
if (Settings::values.use_caches_gc.GetValue() && total_used_memory > minimum_memory) {
if (total_used_memory > minimum_memory) {
RunGarbageCollector();
}
sentenced_images.Tick();
@@ -1078,6 +1043,8 @@ void TextureCache<P>::RegisterImage(ImageId image_id) {
tentative_size = EstimatedDecompressedSize(tentative_size, image.info.format);
}
total_used_memory += Common::AlignUp(tentative_size, 1024);
image.lru_index = lru_cache.Insert(image_id, frame_tick);
ForEachGPUPage(image.gpu_addr, image.guest_size_bytes,
[this, image_id](u64 page) { gpu_page_table[page].push_back(image_id); });
if (False(image.flags & ImageFlagBits::Sparse)) {
@@ -1115,6 +1082,7 @@ void TextureCache<P>::UnregisterImage(ImageId image_id) {
tentative_size = EstimatedDecompressedSize(tentative_size, image.info.format);
}
total_used_memory -= Common::AlignUp(tentative_size, 1024);
lru_cache.Free(image.lru_index);
const auto& clear_page_table =
[this, image_id](
u64 page,
@@ -1384,7 +1352,7 @@ void TextureCache<P>::PrepareImage(ImageId image_id, bool is_modification, bool
if (is_modification) {
MarkModification(image);
}
image.frame_tick = frame_tick;
lru_cache.Touch(image.lru_index, frame_tick);
}
template <class P>


@@ -14,6 +14,7 @@
#include "common/common_types.h"
#include "common/literals.h"
#include "common/lru_cache.h"
#include "video_core/compatible_formats.h"
#include "video_core/delayed_destruction_ring.h"
#include "video_core/engines/fermi_2d.h"
@@ -370,6 +371,12 @@ private:
std::vector<ImageId> uncommitted_downloads;
std::queue<std::vector<ImageId>> committed_downloads;
struct LRUItemParams {
using ObjectType = ImageId;
using TickType = u64;
};
Common::LeastRecentlyUsedCache<LRUItemParams> lru_cache;
static constexpr size_t TICKS_TO_DESTROY = 6;
DelayedDestructionRing<Image, TICKS_TO_DESTROY> sentenced_images;
DelayedDestructionRing<ImageView, TICKS_TO_DESTROY> sentenced_image_view;
@@ -379,7 +386,6 @@ private:
u64 modification_tick = 0;
u64 frame_tick = 0;
typename SlotVector<Image>::Iterator deletion_iterator;
};
} // namespace VideoCommon
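
Note on the hunks above: the calls `lru_cache.Insert`, `lru_cache.Touch`, `lru_cache.Free` and `lru_cache.ForEachItemBelow` follow a tick-ordered least-recently-used pattern. The contents of `common/lru_cache.h` are not shown in this diff, so the following is only a minimal sketch of that pattern under stated assumptions, not the project's implementation. The class name `TickLruSketch`, its `Handle` type, the `Size` helper, and the assumption that ticks never decrease are all hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <iterator>
#include <list>

// Hypothetical illustration only; NOT the real Common::LeastRecentlyUsedCache.
// Items are kept in a list ordered from least to most recently used.
template <typename Object>
class TickLruSketch {
public:
    struct Item {
        Object obj;
        std::uint64_t tick;
    };
    // List iterators stay valid until the element itself is erased.
    using Handle = typename std::list<Item>::iterator;

    // Register an object as the most recently used entry.
    Handle Insert(Object obj, std::uint64_t tick) {
        assert(items.empty() || items.back().tick <= tick); // assumed: ticks never decrease
        items.push_back(Item{obj, tick});
        return std::prev(items.end());
    }

    // Mark an entry as used at `tick` and move it to the most-recent end.
    void Touch(Handle handle, std::uint64_t tick) {
        assert(items.back().tick <= tick); // assumed: ticks never decrease
        handle->tick = tick;
        items.splice(items.end(), items, handle);
    }

    // Remove an entry; its handle must not be used afterwards.
    void Free(Handle handle) {
        items.erase(handle);
    }

    // Visit entries older than `tick`, oldest first. The callback returns
    // true to stop early, and it may Free the entry it is currently given.
    template <typename Func>
    void ForEachItemBelow(std::uint64_t tick, Func&& func) {
        auto it = items.begin();
        while (it != items.end() && it->tick < tick) {
            const auto next = std::next(it); // saved in case the callback frees `it`
            const Object obj = it->obj;      // copied for the same reason
            if (func(obj)) {
                return;
            }
            it = next;
        }
    }

    std::size_t Size() const {
        return items.size();
    }

private:
    std::list<Item> items;
};
```

In this sketch the early-stop return value plays the role that `return true` plays in the `clean_up` lambda above once `num_iterations` reaches zero, and saving the next iterator before invoking the callback is what allows a callback to free the entry it was handed.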


@@ -63,14 +63,6 @@ void SwizzleImpl(std::span<u8> output, std::span<const u8> input, u32 width, u32
const u32 unswizzled_offset =
slice * pitch * height + line * pitch + column * BYTES_PER_PIXEL;
if (const auto offset = (TO_LINEAR ? unswizzled_offset : swizzled_offset);
offset >= input.size()) {
// TODO(Rodrigo): This is an out of bounds access that should never happen. To
// avoid crashing the emulator, break.
ASSERT_MSG(false, "offset {} exceeds input size {}!", offset, input.size());
break;
}
u8* const dst = &output[TO_LINEAR ? swizzled_offset : unswizzled_offset];
const u8* const src = &input[TO_LINEAR ? unswizzled_offset : swizzled_offset];


@@ -818,7 +818,6 @@ void Config::ReadRendererValues() {
ReadGlobalSetting(Settings::values.shader_backend);
ReadGlobalSetting(Settings::values.use_asynchronous_shaders);
ReadGlobalSetting(Settings::values.use_fast_gpu_time);
ReadGlobalSetting(Settings::values.use_caches_gc);
ReadGlobalSetting(Settings::values.bg_red);
ReadGlobalSetting(Settings::values.bg_green);
ReadGlobalSetting(Settings::values.bg_blue);
@@ -1359,7 +1358,6 @@ void Config::SaveRendererValues() {
Settings::values.shader_backend.UsingGlobal());
WriteGlobalSetting(Settings::values.use_asynchronous_shaders);
WriteGlobalSetting(Settings::values.use_fast_gpu_time);
WriteGlobalSetting(Settings::values.use_caches_gc);
WriteGlobalSetting(Settings::values.bg_red);
WriteGlobalSetting(Settings::values.bg_green);
WriteGlobalSetting(Settings::values.bg_blue);


@@ -156,7 +156,7 @@
<item>
<widget class="QCheckBox" name="use_disk_shader_cache">
<property name="text">
<string>Use disk shader cache</string>
<string>Use disk pipeline cache</string>
</property>
</widget>
</item>


@@ -28,7 +28,6 @@ void ConfigureGraphicsAdvanced::SetConfiguration() {
ui->use_vsync->setChecked(Settings::values.use_vsync.GetValue());
ui->use_asynchronous_shaders->setChecked(Settings::values.use_asynchronous_shaders.GetValue());
ui->use_caches_gc->setChecked(Settings::values.use_caches_gc.GetValue());
ui->use_fast_gpu_time->setChecked(Settings::values.use_fast_gpu_time.GetValue());
if (Settings::IsConfiguringGlobal()) {
@@ -55,8 +54,6 @@ void ConfigureGraphicsAdvanced::ApplyConfiguration() {
ConfigurationShared::ApplyPerGameSetting(&Settings::values.use_asynchronous_shaders,
ui->use_asynchronous_shaders,
use_asynchronous_shaders);
ConfigurationShared::ApplyPerGameSetting(&Settings::values.use_caches_gc, ui->use_caches_gc,
use_caches_gc);
ConfigurationShared::ApplyPerGameSetting(&Settings::values.use_fast_gpu_time,
ui->use_fast_gpu_time, use_fast_gpu_time);
}
@@ -81,7 +78,6 @@ void ConfigureGraphicsAdvanced::SetupPerGameUI() {
ui->use_asynchronous_shaders->setEnabled(
Settings::values.use_asynchronous_shaders.UsingGlobal());
ui->use_fast_gpu_time->setEnabled(Settings::values.use_fast_gpu_time.UsingGlobal());
ui->use_caches_gc->setEnabled(Settings::values.use_caches_gc.UsingGlobal());
ui->anisotropic_filtering_combobox->setEnabled(
Settings::values.max_anisotropy.UsingGlobal());
@@ -94,8 +90,6 @@ void ConfigureGraphicsAdvanced::SetupPerGameUI() {
use_asynchronous_shaders);
ConfigurationShared::SetColoredTristate(ui->use_fast_gpu_time,
Settings::values.use_fast_gpu_time, use_fast_gpu_time);
ConfigurationShared::SetColoredTristate(ui->use_caches_gc, Settings::values.use_caches_gc,
use_caches_gc);
ConfigurationShared::SetColoredComboBox(
ui->gpu_accuracy, ui->label_gpu_accuracy,
static_cast<int>(Settings::values.gpu_accuracy.GetValue(true)));


@@ -37,5 +37,4 @@ private:
ConfigurationShared::CheckState use_vsync;
ConfigurationShared::CheckState use_asynchronous_shaders;
ConfigurationShared::CheckState use_fast_gpu_time;
ConfigurationShared::CheckState use_caches_gc;
};


@@ -82,7 +82,7 @@
<string>Enables asynchronous shader compilation, which may reduce shader stutter. This feature is experimental.</string>
</property>
<property name="text">
<string>Use asynchronous shader building (hack)</string>
<string>Use asynchronous shader building (Hack)</string>
</property>
</widget>
</item>
@@ -92,17 +92,7 @@
<string>Enables Fast GPU Time. This option will force most games to run at their highest native resolution.</string>
</property>
<property name="text">
<string>Use Fast GPU Time (hack)</string>
</property>
</widget>
</item>
<item>
<widget class="QCheckBox" name="use_caches_gc">
<property name="toolTip">
<string>Enables garbage collection for the GPU caches, this will try to keep VRAM within 3-4 GB by flushing the least used textures/buffers. May cause issues in a few games.</string>
</property>
<property name="text">
<string>Enable GPU cache garbage collection (experimental)</string>
<string>Use Fast GPU Time (Hack)</string>
</property>
</widget>
</item>


@@ -515,16 +515,16 @@ void GameList::AddGamePopup(QMenu& context_menu, u64 program_id, const std::stri
QAction* open_save_location = context_menu.addAction(tr("Open Save Data Location"));
QAction* open_mod_location = context_menu.addAction(tr("Open Mod Data Location"));
QAction* open_transferable_shader_cache =
context_menu.addAction(tr("Open Transferable Shader Cache"));
context_menu.addAction(tr("Open Transferable Pipeline Cache"));
context_menu.addSeparator();
QMenu* remove_menu = context_menu.addMenu(tr("Remove"));
QAction* remove_update = remove_menu->addAction(tr("Remove Installed Update"));
QAction* remove_dlc = remove_menu->addAction(tr("Remove All Installed DLC"));
QAction* remove_custom_config = remove_menu->addAction(tr("Remove Custom Configuration"));
QAction* remove_gl_shader_cache = remove_menu->addAction(tr("Remove OpenGL Shader Cache"));
QAction* remove_vk_shader_cache = remove_menu->addAction(tr("Remove Vulkan Shader Cache"));
QAction* remove_gl_shader_cache = remove_menu->addAction(tr("Remove OpenGL Pipeline Cache"));
QAction* remove_vk_shader_cache = remove_menu->addAction(tr("Remove Vulkan Pipeline Cache"));
remove_menu->addSeparator();
QAction* remove_shader_cache = remove_menu->addAction(tr("Remove All Shader Caches"));
QAction* remove_shader_cache = remove_menu->addAction(tr("Remove All Pipeline Caches"));
QAction* remove_all_content = remove_menu->addAction(tr("Remove All Installed Contents"));
QMenu* dump_romfs_menu = context_menu.addMenu(tr("Dump RomFS"));
QAction* dump_romfs = dump_romfs_menu->addAction(tr("Dump RomFS"));


@@ -468,7 +468,6 @@ void Config::ReadValues() {
ReadSetting("Renderer", Settings::values.use_nvdec_emulation);
ReadSetting("Renderer", Settings::values.accelerate_astc);
ReadSetting("Renderer", Settings::values.use_fast_gpu_time);
ReadSetting("Renderer", Settings::values.use_caches_gc);
ReadSetting("Renderer", Settings::values.bg_red);
ReadSetting("Renderer", Settings::values.bg_green);