Compare commits

..

49 Commits

Author SHA1 Message Date
Lioncash
0722adb471 externals: Update dynarmic to dfdec79 2018-07-15 16:11:45 -04:00
bunnei
068668780c Merge pull request #668 from jroweboy/controller-lag
HID: Update controllers less often
2018-07-15 13:06:28 -07:00
bunnei
04b9cde4f5 Merge pull request #664 from jroweboy/logging-stuff
Minor logging improvements
2018-07-15 12:58:52 -07:00
James Rowe
7d209b3c9f HID: Update controllers less often 2018-07-15 13:47:41 -06:00
James Rowe
497b81558e Logging: Dump all logs in the queue on close in debug mode 2018-07-15 13:02:09 -06:00
bunnei
98762e9601 Merge pull request #666 from bunnei/g8r8
gl_rasterizer_cache: Implement texture format G8R8.
2018-07-15 10:09:36 -07:00
bunnei
3a96670f2d gl_rasterizer_cache: Implement texture format G8R8. 2018-07-15 01:33:42 -04:00
bunnei
aaec0b7e70 Merge pull request #665 from bunnei/fix-z24-s8
gl_rasterizer_cache: Fix incorrect offset in ConvertS8Z24ToZ24S8.
2018-07-14 22:18:55 -07:00
bunnei
f8ab956189 Merge pull request #659 from bunnei/depth16
gl_rasterizer_cache: Implement depth format Z16_UNORM.
2018-07-14 21:39:23 -07:00
bunnei
3145114190 gl_rasterizer_cache: Fix incorrect offset in ConvertS8Z24ToZ24S8. 2018-07-15 00:02:05 -04:00
bunnei
e21190f47f gl_rasterizer_cache: Implement depth format Z16_UNORM. 2018-07-14 23:43:28 -04:00
bunnei
2cb3fdca86 Merge pull request #598 from bunnei/makedonecurrent
OpenGL: Use MakeCurrent/DoneCurrent for multithreaded rendering.
2018-07-14 20:18:11 -07:00
bunnei
c324a378ac Merge pull request #663 from Subv/bsd
Services/BSD: Corrected the return for StartMonitoring according to SwIPC
2018-07-14 19:40:34 -07:00
bunnei
fd1f5c5414 Merge pull request #662 from Subv/delete_file
FileSys: Append the requested path to the filesystem base path in DeleteFile
2018-07-14 13:11:58 -07:00
James Rowe
6daebaaa57 Logging: Don't lock the queue for the duration of the write 2018-07-14 11:57:13 -06:00
Subv
b07f4d6afb Services/BSD: Corrected the return for StartMonitoring according to SwIPC. 2018-07-14 12:34:07 -05:00
bunnei
ad0166a982 Merge pull request #661 from ogniK5377/assert-nit
No need to use ASSERT_MSG with an empty assert message
2018-07-14 09:12:21 -07:00
Subv
7e5e4f8d7a FileSys: Append the requested path to the filesystem base path in DeleteFile.
We were trying to delete things in the current directory instead of the actual filesystem directory. This may fix some savedata issues in some games.
2018-07-14 10:57:22 -05:00
David Marcec
a7d6c0d6ea No need to use ASSERT_MSG with an empty message 2018-07-14 23:13:16 +10:00
bunnei
81739a5448 Merge pull request #660 from Subv/depth_write
GPU: Always enable the depth write when clearing the depth buffer.
2018-07-14 00:38:12 -07:00
bunnei
05cb10530f OpenGL: Use MakeCurrent/DoneCurrent for multithreaded rendering. 2018-07-14 02:50:35 -04:00
Subv
b37354cca8 GPU: Always enable the depth write when clearing the depth buffer.
The GPU ignores that register when clearing, but OpenGL obeys the glDepthMask parameter, so we set the depth mask to GL_TRUE when clearing the depth buffer. It will be restored to the correct value automatically on the next draw call.
2018-07-14 00:52:23 -05:00
bunnei
9fc0d1d701 Merge pull request #657 from bunnei/dual-vs
gl_shader_gen: Implement dual vertex shader mode.
2018-07-13 07:08:54 -07:00
Hedges
e066bc75b9 More improvements to GDBStub (#653)
* More improvements to GDBStub
- Debugging of threads should work correctly with source and assembly level stepping and modifying registers and memory, meaning threads and callstacks are fully clickable in VS.
- List of modules is available to the client, with assumption that .nro and .nso are backed up by an .elf with symbols, while deconstructed ROMs keep N names.
- Initial support for floating point registers.

* Tidy up as requested in PR feedback

* Tidy up as requested in PR feedback
2018-07-12 20:22:59 -07:00
bunnei
8aeff9cf8e gl_rasterizer: Fix check for if a shader stage is enabled. 2018-07-12 22:57:57 -04:00
bunnei
c4015cd93a gl_shader_gen: Implement dual vertex shader mode.
- When VertexA shader stage is enabled, we combine with VertexB program to make a single Vertex Shader stage.
2018-07-12 22:25:36 -04:00
bunnei
ce23ae3ede Merge pull request #656 from ogniK5377/audren-mem-init
Initialized memory for RequestUpdateAudioRenderer and fixed MemoryPoolSection to be more accurate
2018-07-12 19:12:47 -07:00
bunnei
64b5e5d5d9 Merge pull request #655 from bunnei/pred-lt-nan
gl_shader_decompiler: Implement PredCondition::LessThanWithNan.
2018-07-12 18:59:15 -07:00
bunnei
52636f67cc Merge pull request #654 from bunnei/cond-exit
gl_shader_decompiler: Use FlowCondition field in EXIT instruction.
2018-07-12 18:59:05 -07:00
David Marcec
8bd8d1e3da We only need to alert for memory pool changes 2018-07-13 10:36:28 +10:00
David Marcec
6642011706 initialized voice status and unused sizes in the update data header 2018-07-13 10:35:44 +10:00
bunnei
49c0c081c4 gl_shader_decompiler: Implement PredCondition::LessThanWithNan. 2018-07-12 20:04:35 -04:00
bunnei
4757ffdcce gl_shader_decompiler: Use FlowCondition field in EXIT instruction. 2018-07-12 20:00:37 -04:00
Sebastian Valle
274d1fb0fc Merge pull request #652 from Subv/fadd32i
GPU: Implement the FADD32I shader instruction.
2018-07-12 17:36:51 -05:00
bunnei
3ff21345b4 Merge pull request #651 from Subv/ffma_decode
GPU: Corrected the decoding of FFMA for immediate operands.
2018-07-12 12:42:58 -07:00
Subv
c1ae841f47 GPU: Implement the FADD32I shader instruction. 2018-07-12 12:00:31 -05:00
Tobias
316b933a31 Port #3335 and #3373 from Citra: "Small SDL fixes" and "Print the actual error preventing SDL from working" (#637)
* Port #3335 and #3373 from Citra

* Fixup: Use the new logging placeholders
2018-07-12 09:26:27 -07:00
Subv
0cad310e12 GPU: Corrected the decoding of FFMA for immediate operands. 2018-07-12 10:15:48 -05:00
bunnei
4f41ffdd41 Merge pull request #648 from ogniK5377/no-net
Let games/application know that we're offline
2018-07-12 06:44:15 -07:00
bunnei
7c7b2b8285 Merge pull request #649 from ogniK5377/audout-auto
Audout "Auto" functions
2018-07-12 06:43:37 -07:00
bunnei
b89fb430c7 Merge pull request #650 from jroweboy/logging-stuff
Minor logging fixes
2018-07-12 06:42:46 -07:00
James Rowe
b30c5370b1 yuzu - Fix duplicate logs 2018-07-12 01:11:43 -06:00
James Rowe
020d005d8c yuzu-cmd Apply the filter string from settings 2018-07-12 01:09:03 -06:00
David Marcec
706892de7d Audout "Auto" functions
Audout autos are identical to their counterpart except for the buffer type which yuzu already handles for us.
2018-07-12 16:57:31 +10:00
David Marcec
3d68f6ba6c Added IsWirelessCommunicationEnabled, IsEthernetCommunicationEnabled, IsAnyInternetRequestAccepted
Since we have no socket implementation we should be returning 0 to indicate we're currently offline.
2018-07-12 16:40:17 +10:00
bunnei
7230ceb584 Merge pull request #559 from Subv/mount_savedata
Services/FS: Return the correct error code when trying to mount a nonexistent savedata.
2018-07-11 20:21:52 -07:00
bunnei
afb26b190f Merge pull request #585 from janisozaur/patch-11
Improve directory creation in WindowsCopyFiles.cmake
2018-07-11 20:21:17 -07:00
Michał Janiszewski
23dc36ed71 Improve directory creation in WindowsCopyFiles.cmake 2018-06-24 21:27:00 +02:00
Subv
5f57a70a7d Services/FS: Return the correct error code when trying to mount a nonexistent savedata. 2018-06-18 19:26:01 -05:00
33 changed files with 416 additions and 164 deletions

View File

@@ -1,4 +1,4 @@
# Copyright 2016 Citra Emulator Project
# Copyright 2018 Yuzu Emulator Project
# Licensed under GPLv2 or any later version
# Refer to the license.txt file included.
@@ -22,7 +22,7 @@ function(windows_copy_files TARGET SOURCE_DIR DEST_DIR)
# cmake adds an extra check for command success which doesn't work too well with robocopy
# so trick it into thinking the command was successful with the || cmd /c "exit /b 0"
add_custom_command(TARGET ${TARGET} POST_BUILD
COMMAND if not exist ${DEST_DIR} mkdir ${DEST_DIR} 2> nul
COMMAND ${CMAKE_COMMAND} -E make_directory ${DEST_DIR}
COMMAND robocopy ${SOURCE_DIR} ${DEST_DIR} ${ARGN} /NJH /NJS /NDL /NFL /NC /NS /NP || cmd /c "exit /b 0"
)
endfunction()
endfunction()

View File

@@ -5,6 +5,7 @@
#include <algorithm>
#include <array>
#include <chrono>
#include <climits>
#include <condition_variable>
#include <memory>
#include <thread>
@@ -83,8 +84,10 @@ private:
}
};
while (true) {
std::unique_lock<std::mutex> lock(message_mutex);
message_cv.wait(lock, [&] { return !running || message_queue.Pop(entry); });
{
std::unique_lock<std::mutex> lock(message_mutex);
message_cv.wait(lock, [&] { return !running || message_queue.Pop(entry); });
}
if (!running) {
break;
}
@@ -92,7 +95,7 @@ private:
}
// Drain the logging queue. Only writes out up to MAX_LOGS_TO_WRITE to prevent a case
// where a system is repeatedly spamming logs even on close.
constexpr int MAX_LOGS_TO_WRITE = 100;
const int MAX_LOGS_TO_WRITE = filter.IsDebug() ? INT_MAX : 100;
int logs_written = 0;
while (logs_written++ < MAX_LOGS_TO_WRITE && message_queue.Pop(entry)) {
write_logs(entry);
@@ -282,4 +285,4 @@ void FmtLogMessageImpl(Class log_class, Level log_level, const char* filename,
Impl::Instance().PushEntry(std::move(entry));
}
} // namespace Log
} // namespace Log

View File

@@ -94,4 +94,11 @@ bool Filter::ParseFilterRule(const std::string::const_iterator begin,
bool Filter::CheckMessage(Class log_class, Level level) const {
return static_cast<u8>(level) >= static_cast<u8>(class_levels[static_cast<size_t>(log_class)]);
}
bool Filter::IsDebug() const {
return std::any_of(class_levels.begin(), class_levels.end(), [](const Level& l) {
return static_cast<u8>(l) <= static_cast<u8>(Level::Debug);
});
}
} // namespace Log

View File

@@ -47,6 +47,9 @@ public:
/// Matches class/level combination against the filter, returning true if it passed.
bool CheckMessage(Class log_class, Level level) const;
/// Returns true if any logging classes are set to debug
bool IsDebug() const;
private:
std::array<Level, (size_t)Class::Count> class_levels;
};

View File

@@ -58,11 +58,13 @@ ResultVal<std::unique_ptr<StorageBackend>> Disk_FileSystem::OpenFile(const std::
}
ResultCode Disk_FileSystem::DeleteFile(const std::string& path) const {
if (!FileUtil::Exists(path)) {
std::string full_path = base_directory + path;
if (!FileUtil::Exists(full_path)) {
return ERROR_PATH_NOT_FOUND;
}
FileUtil::Delete(path);
FileUtil::Delete(full_path);
return RESULT_SUCCESS;
}

View File

@@ -11,6 +11,7 @@ namespace FileSys {
namespace ErrCodes {
enum {
NotFound = 1,
SaveDataNotFound = 1002,
};
}

View File

@@ -214,8 +214,8 @@ ResultCode HLERequestContext::WriteToOutgoingCommandBuffer(Thread& thread) {
(sizeof(IPC::CommandHeader) + sizeof(IPC::HandleDescriptorHeader)) / sizeof(u32);
ASSERT_MSG(!handle_descriptor_header->send_current_pid, "Sending PID is not implemented");
ASSERT_MSG(copy_objects.size() == handle_descriptor_header->num_handles_to_copy);
ASSERT_MSG(move_objects.size() == handle_descriptor_header->num_handles_to_move);
ASSERT(copy_objects.size() == handle_descriptor_header->num_handles_to_copy);
ASSERT(move_objects.size() == handle_descriptor_header->num_handles_to_move);
// We don't make a distinction between copy and move handles when translating since HLE
// services don't deal with handles directly. However, the guest applications might check

View File

@@ -27,12 +27,12 @@ public:
{0, &IAudioOut::GetAudioOutState, "GetAudioOutState"},
{1, &IAudioOut::StartAudioOut, "StartAudioOut"},
{2, &IAudioOut::StopAudioOut, "StopAudioOut"},
{3, &IAudioOut::AppendAudioOutBuffer, "AppendAudioOutBuffer"},
{3, &IAudioOut::AppendAudioOutBufferImpl, "AppendAudioOutBuffer"},
{4, &IAudioOut::RegisterBufferEvent, "RegisterBufferEvent"},
{5, &IAudioOut::GetReleasedAudioOutBuffer, "GetReleasedAudioOutBuffer"},
{5, &IAudioOut::GetReleasedAudioOutBufferImpl, "GetReleasedAudioOutBuffer"},
{6, nullptr, "ContainsAudioOutBuffer"},
{7, nullptr, "AppendAudioOutBufferAuto"},
{8, nullptr, "GetReleasedAudioOutBufferAuto"},
{7, &IAudioOut::AppendAudioOutBufferImpl, "AppendAudioOutBufferAuto"},
{8, &IAudioOut::GetReleasedAudioOutBufferImpl, "GetReleasedAudioOutBufferAuto"},
{9, nullptr, "GetAudioOutBufferCount"},
{10, nullptr, "GetAudioOutPlayedSampleCount"},
{11, nullptr, "FlushAudioOutBuffers"},
@@ -96,7 +96,7 @@ private:
rb.PushCopyObjects(buffer_event);
}
void AppendAudioOutBuffer(Kernel::HLERequestContext& ctx) {
void AppendAudioOutBufferImpl(Kernel::HLERequestContext& ctx) {
LOG_WARNING(Service_Audio, "(STUBBED) called");
IPC::RequestParser rp{ctx};
@@ -107,7 +107,7 @@ private:
rb.Push(RESULT_SUCCESS);
}
void GetReleasedAudioOutBuffer(Kernel::HLERequestContext& ctx) {
void GetReleasedAudioOutBufferImpl(Kernel::HLERequestContext& ctx) {
LOG_WARNING(Service_Audio, "(STUBBED) called");
// TODO(st4rk): This is how libtransistor currently implements the
@@ -163,7 +163,7 @@ private:
AudioState audio_out_state;
};
void AudOutU::ListAudioOuts(Kernel::HLERequestContext& ctx) {
void AudOutU::ListAudioOutsImpl(Kernel::HLERequestContext& ctx) {
LOG_WARNING(Service_Audio, "(STUBBED) called");
IPC::RequestParser rp{ctx};
@@ -179,7 +179,7 @@ void AudOutU::ListAudioOuts(Kernel::HLERequestContext& ctx) {
rb.Push<u32>(1);
}
void AudOutU::OpenAudioOut(Kernel::HLERequestContext& ctx) {
void AudOutU::OpenAudioOutImpl(Kernel::HLERequestContext& ctx) {
LOG_WARNING(Service_Audio, "(STUBBED) called");
if (!audio_out_interface) {
@@ -196,10 +196,10 @@ void AudOutU::OpenAudioOut(Kernel::HLERequestContext& ctx) {
}
AudOutU::AudOutU() : ServiceFramework("audout:u") {
static const FunctionInfo functions[] = {{0, &AudOutU::ListAudioOuts, "ListAudioOuts"},
{1, &AudOutU::OpenAudioOut, "OpenAudioOut"},
{2, nullptr, "ListAudioOutsAuto"},
{3, nullptr, "OpenAudioOutAuto"}};
static const FunctionInfo functions[] = {{0, &AudOutU::ListAudioOutsImpl, "ListAudioOuts"},
{1, &AudOutU::OpenAudioOutImpl, "OpenAudioOut"},
{2, &AudOutU::ListAudioOutsImpl, "ListAudioOutsAuto"},
{3, &AudOutU::OpenAudioOutImpl, "OpenAudioOutAuto"}};
RegisterHandlers(functions);
}

View File

@@ -22,8 +22,8 @@ public:
private:
std::shared_ptr<IAudioOut> audio_out_interface;
void ListAudioOuts(Kernel::HLERequestContext& ctx);
void OpenAudioOut(Kernel::HLERequestContext& ctx);
void ListAudioOutsImpl(Kernel::HLERequestContext& ctx);
void OpenAudioOutImpl(Kernel::HLERequestContext& ctx);
enum class PcmFormat : u32 {
Invalid = 0,

View File

@@ -47,7 +47,7 @@ public:
// Start the audio event
CoreTiming::ScheduleEvent(audio_ticks, audio_event);
voice_status_list.reserve(worker_params.voice_count);
voice_status_list.resize(worker_params.voice_count);
}
~IAudioRenderer() {
CoreTiming::UnscheduleEvent(audio_event, 0);
@@ -87,8 +87,6 @@ private:
memory_pool[i].state = MemoryPoolStates::Attached;
else if (mem_pool_info[i].pool_state == MemoryPoolStates::RequestDetach)
memory_pool[i].state = MemoryPoolStates::Detached;
else
memory_pool[i].state = mem_pool_info[i].pool_state;
}
std::memcpy(output.data() + sizeof(UpdateDataHeader), memory_pool.data(),
response_data.memory_pools_size);
@@ -183,7 +181,9 @@ private:
behavior_size = 0xb0;
memory_pools_size = (config.effect_count + (config.voice_count * 4)) * 0x10;
voices_size = config.voice_count * 0x10;
voice_resource_size = 0x0;
effects_size = config.effect_count * 0x10;
mixes_size = 0x0;
sinks_size = config.sink_count * 0x20;
performance_manager_size = 0x10;
total_size = sizeof(UpdateDataHeader) + behavior_size + memory_pools_size +

View File

@@ -7,6 +7,7 @@
#include "common/string_util.h"
#include "core/core.h"
#include "core/file_sys/directory.h"
#include "core/file_sys/errors.h"
#include "core/file_sys/filesystem.h"
#include "core/file_sys/storage.h"
#include "core/hle/ipc_helpers.h"
@@ -531,12 +532,20 @@ void FSP_SRV::CreateSaveData(Kernel::HLERequestContext& ctx) {
void FSP_SRV::MountSaveData(Kernel::HLERequestContext& ctx) {
LOG_WARNING(Service_FS, "(STUBBED) called");
// TODO(Subv): Read the input parameters and mount the requested savedata instead of always
// mounting the current process' savedata.
FileSys::Path unused;
auto filesystem = OpenFileSystem(Type::SaveData, unused).Unwrap();
auto filesystem = OpenFileSystem(Type::SaveData, unused);
if (filesystem.Failed()) {
IPC::ResponseBuilder rb{ctx, 2, 0, 0};
rb.Push(ResultCode(ErrorModule::FS, FileSys::ErrCodes::SaveDataNotFound));
return;
}
IPC::ResponseBuilder rb{ctx, 2, 0, 1};
rb.Push(RESULT_SUCCESS);
rb.PushIpcInterface<IFileSystem>(std::move(filesystem));
rb.PushIpcInterface<IFileSystem>(std::move(filesystem.Unwrap()));
}
void FSP_SRV::GetGlobalAccessLogMode(Kernel::HLERequestContext& ctx) {

View File

@@ -18,9 +18,9 @@ namespace Service::HID {
// Updating period for each HID device.
// TODO(shinyquagsire23): These need better values.
constexpr u64 pad_update_ticks = CoreTiming::BASE_CLOCK_RATE / 10000;
constexpr u64 accelerometer_update_ticks = CoreTiming::BASE_CLOCK_RATE / 10000;
constexpr u64 gyroscope_update_ticks = CoreTiming::BASE_CLOCK_RATE / 10000;
constexpr u64 pad_update_ticks = CoreTiming::BASE_CLOCK_RATE / 100;
constexpr u64 accelerometer_update_ticks = CoreTiming::BASE_CLOCK_RATE / 100;
constexpr u64 gyroscope_update_ticks = CoreTiming::BASE_CLOCK_RATE / 100;
class IAppletResource final : public ServiceFramework<IAppletResource> {
public:

View File

@@ -148,6 +148,24 @@ private:
LOG_DEBUG(Service_NIFM, "called");
}
void IsWirelessCommunicationEnabled(Kernel::HLERequestContext& ctx) {
LOG_WARNING(Service_NIFM, "(STUBBED) called");
IPC::ResponseBuilder rb{ctx, 3};
rb.Push(RESULT_SUCCESS);
rb.Push<u8>(0);
}
void IsEthernetCommunicationEnabled(Kernel::HLERequestContext& ctx) {
LOG_WARNING(Service_NIFM, "(STUBBED) called");
IPC::ResponseBuilder rb{ctx, 3};
rb.Push(RESULT_SUCCESS);
rb.Push<u8>(0);
}
void IsAnyInternetRequestAccepted(Kernel::HLERequestContext& ctx) {
LOG_WARNING(Service_NIFM, "(STUBBED) called");
IPC::ResponseBuilder rb{ctx, 3};
rb.Push(RESULT_SUCCESS);
rb.Push<u8>(0);
}
};
IGeneralService::IGeneralService() : ServiceFramework("IGeneralService") {
@@ -167,11 +185,11 @@ IGeneralService::IGeneralService() : ServiceFramework("IGeneralService") {
{14, &IGeneralService::CreateTemporaryNetworkProfile, "CreateTemporaryNetworkProfile"},
{15, nullptr, "GetCurrentIpConfigInfo"},
{16, nullptr, "SetWirelessCommunicationEnabled"},
{17, nullptr, "IsWirelessCommunicationEnabled"},
{17, &IGeneralService::IsWirelessCommunicationEnabled, "IsWirelessCommunicationEnabled"},
{18, nullptr, "GetInternetConnectionStatus"},
{19, nullptr, "SetEthernetCommunicationEnabled"},
{20, nullptr, "IsEthernetCommunicationEnabled"},
{21, nullptr, "IsAnyInternetRequestAccepted"},
{20, &IGeneralService::IsEthernetCommunicationEnabled, "IsEthernetCommunicationEnabled"},
{21, &IGeneralService::IsAnyInternetRequestAccepted, "IsAnyInternetRequestAccepted"},
{22, nullptr, "IsAnyForegroundRequestAccepted"},
{23, nullptr, "PutToSleep"},
{24, nullptr, "WakeUp"},

View File

@@ -19,10 +19,9 @@ void BSD::RegisterClient(Kernel::HLERequestContext& ctx) {
void BSD::StartMonitoring(Kernel::HLERequestContext& ctx) {
LOG_WARNING(Service, "(STUBBED) called");
IPC::ResponseBuilder rb{ctx, 3};
IPC::ResponseBuilder rb{ctx, 2};
rb.Push(RESULT_SUCCESS);
rb.Push<u32>(0); // bsd errno
}
void BSD::Socket(Kernel::HLERequestContext& ctx) {

View File

@@ -398,27 +398,6 @@ u32 Maxwell3D::GetRegisterValue(u32 method) const {
return regs.reg_array[method];
}
bool Maxwell3D::IsShaderStageEnabled(Regs::ShaderStage stage) const {
// The Vertex stage is always enabled.
if (stage == Regs::ShaderStage::Vertex)
return true;
switch (stage) {
case Regs::ShaderStage::TesselationControl:
return regs.shader_config[static_cast<size_t>(Regs::ShaderProgram::TesselationControl)]
.enable != 0;
case Regs::ShaderStage::TesselationEval:
return regs.shader_config[static_cast<size_t>(Regs::ShaderProgram::TesselationEval)]
.enable != 0;
case Regs::ShaderStage::Geometry:
return regs.shader_config[static_cast<size_t>(Regs::ShaderProgram::Geometry)].enable != 0;
case Regs::ShaderStage::Fragment:
return regs.shader_config[static_cast<size_t>(Regs::ShaderProgram::Fragment)].enable != 0;
}
UNREACHABLE();
}
void Maxwell3D::ProcessClearBuffers() {
ASSERT(regs.clear_buffers.R == regs.clear_buffers.G &&
regs.clear_buffers.R == regs.clear_buffers.B &&

View File

@@ -379,6 +379,14 @@ public:
}
};
bool IsShaderConfigEnabled(size_t index) const {
// The VertexB is always enabled.
if (index == static_cast<size_t>(Regs::ShaderProgram::VertexB)) {
return true;
}
return shader_config[index].enable != 0;
}
union {
struct {
INSERT_PADDING_WORDS(0x45);
@@ -780,9 +788,6 @@ public:
/// Returns the texture information for a specific texture in a specific shader stage.
Texture::FullTextureInfo GetStageTexture(Regs::ShaderStage stage, size_t offset) const;
/// Returns whether the specified shader stage is enabled or not.
bool IsShaderStageEnabled(Regs::ShaderStage stage) const;
private:
std::unordered_map<u32, std::vector<u32>> uploaded_macros;

View File

@@ -142,6 +142,7 @@ enum class PredCondition : u64 {
GreaterThan = 4,
NotEqual = 5,
GreaterEqual = 6,
LessThanWithNan = 9,
NotEqualWithNan = 13,
// TODO(Subv): Other condition types
};
@@ -201,6 +202,11 @@ enum class IMinMaxExchange : u64 {
XHi = 3,
};
enum class FlowCondition : u64 {
Always = 0xF,
Fcsm_Tr = 0x1C, // TODO(bunnei): What is this used for?
};
union Instruction {
Instruction& operator=(const Instruction& instr) {
value = instr.value;
@@ -297,6 +303,13 @@ union Instruction {
BitField<56, 1, u64> negate_a;
} iadd32i;
union {
BitField<53, 1, u64> negate_b;
BitField<54, 1, u64> abs_a;
BitField<56, 1, u64> negate_a;
BitField<57, 1, u64> abs_b;
} fadd32i;
union {
BitField<20, 8, u64> shift_position;
BitField<28, 8, u64> shift_length;
@@ -308,6 +321,10 @@ union Instruction {
}
} bfe;
union {
BitField<0, 5, FlowCondition> cond;
} flow;
union {
BitField<48, 1, u64> negate_b;
BitField<49, 1, u64> negate_c;
@@ -487,6 +504,7 @@ public:
FADD_C,
FADD_R,
FADD_IMM,
FADD32I,
FMUL_C,
FMUL_R,
FMUL_IMM,
@@ -679,13 +697,14 @@ private:
INST("1101101---------", Id::TLDS, Type::Memory, "TLDS"),
INST("111000110000----", Id::EXIT, Type::Trivial, "EXIT"),
INST("11100000--------", Id::IPA, Type::Trivial, "IPA"),
INST("001100101-------", Id::FFMA_IMM, Type::Ffma, "FFMA_IMM"),
INST("0011001-1-------", Id::FFMA_IMM, Type::Ffma, "FFMA_IMM"),
INST("010010011-------", Id::FFMA_CR, Type::Ffma, "FFMA_CR"),
INST("010100011-------", Id::FFMA_RC, Type::Ffma, "FFMA_RC"),
INST("010110011-------", Id::FFMA_RR, Type::Ffma, "FFMA_RR"),
INST("0100110001011---", Id::FADD_C, Type::Arithmetic, "FADD_C"),
INST("0101110001011---", Id::FADD_R, Type::Arithmetic, "FADD_R"),
INST("0011100-01011---", Id::FADD_IMM, Type::Arithmetic, "FADD_IMM"),
INST("000010----------", Id::FADD32I, Type::ArithmeticImmediate, "FADD32I"),
INST("0100110001101---", Id::FMUL_C, Type::Arithmetic, "FMUL_C"),
INST("0101110001101---", Id::FMUL_R, Type::Arithmetic, "FMUL_R"),
INST("0011100-01101---", Id::FMUL_IMM, Type::Arithmetic, "FMUL_IMM"),

View File

@@ -15,6 +15,7 @@
#include "common/microprofile.h"
#include "common/scope_exit.h"
#include "core/core.h"
#include "core/frontend/emu_window.h"
#include "core/hle/kernel/process.h"
#include "core/settings.h"
#include "video_core/engines/maxwell_3d.h"
@@ -22,6 +23,7 @@
#include "video_core/renderer_opengl/gl_shader_gen.h"
#include "video_core/renderer_opengl/maxwell_to_gl.h"
#include "video_core/renderer_opengl/renderer_opengl.h"
#include "video_core/video_core.h"
using Maxwell = Tegra::Engines::Maxwell3D::Regs;
using PixelFormat = SurfaceParams::PixelFormat;
@@ -181,6 +183,19 @@ std::pair<u8*, GLintptr> RasterizerOpenGL::SetupVertexArrays(u8* array_ptr,
return {array_ptr, buffer_offset};
}
static GLShader::ProgramCode GetShaderProgramCode(Maxwell::ShaderProgram program) {
auto& gpu = Core::System().GetInstance().GPU().Maxwell3D();
// Fetch program code from memory
GLShader::ProgramCode program_code;
auto& shader_config = gpu.regs.shader_config[static_cast<size_t>(program)];
const u64 gpu_address{gpu.regs.code_address.CodeAddress() + shader_config.offset};
const boost::optional<VAddr> cpu_address{gpu.memory_manager.GpuToCpuAddress(gpu_address)};
Memory::ReadBlock(*cpu_address, program_code.data(), program_code.size() * sizeof(u64));
return program_code;
}
void RasterizerOpenGL::SetupShaders(u8* buffer_ptr, GLintptr buffer_offset) {
// Helper function for uploading uniform data
const auto copy_buffer = [&](GLuint handle, GLintptr offset, GLsizeiptr size) {
@@ -193,26 +208,23 @@ void RasterizerOpenGL::SetupShaders(u8* buffer_ptr, GLintptr buffer_offset) {
};
auto& gpu = Core::System().GetInstance().GPU().Maxwell3D();
ASSERT_MSG(!gpu.regs.shader_config[0].enable, "VertexA is unsupported!");
// Next available bindpoints to use when uploading the const buffers and textures to the GLSL
// shaders. The constbuffer bindpoint starts after the shader stage configuration bind points.
u32 current_constbuffer_bindpoint = uniform_buffers.size();
u32 current_texture_bindpoint = 0;
for (unsigned index = 1; index < Maxwell::MaxShaderProgram; ++index) {
for (size_t index = 0; index < Maxwell::MaxShaderProgram; ++index) {
auto& shader_config = gpu.regs.shader_config[index];
const Maxwell::ShaderProgram program{static_cast<Maxwell::ShaderProgram>(index)};
const auto& stage = index - 1; // Stage indices are 0 - 5
const bool is_enabled = gpu.IsShaderStageEnabled(static_cast<Maxwell::ShaderStage>(stage));
// Skip stages that are not enabled
if (!is_enabled) {
if (!gpu.regs.IsShaderConfigEnabled(index)) {
continue;
}
const size_t stage{index == 0 ? 0 : index - 1}; // Stage indices are 0 - 5
GLShader::MaxwellUniformData ubo{};
ubo.SetFromRegs(gpu.state.shader_stages[stage]);
std::memcpy(buffer_ptr, &ubo, sizeof(ubo));
@@ -228,16 +240,21 @@ void RasterizerOpenGL::SetupShaders(u8* buffer_ptr, GLintptr buffer_offset) {
buffer_ptr += sizeof(GLShader::MaxwellUniformData);
buffer_offset += sizeof(GLShader::MaxwellUniformData);
// Fetch program code from memory
GLShader::ProgramCode program_code;
const u64 gpu_address{gpu.regs.code_address.CodeAddress() + shader_config.offset};
const boost::optional<VAddr> cpu_address{gpu.memory_manager.GpuToCpuAddress(gpu_address)};
Memory::ReadBlock(*cpu_address, program_code.data(), program_code.size() * sizeof(u64));
GLShader::ShaderSetup setup{std::move(program_code)};
GLShader::ShaderSetup setup{GetShaderProgramCode(program)};
GLShader::ShaderEntries shader_resources;
switch (program) {
case Maxwell::ShaderProgram::VertexA: {
// VertexB is always enabled, so when VertexA is enabled, we have two vertex shaders.
// Conventional HW does not support this, so we combine VertexA and VertexB into one
// stage here.
setup.SetProgramB(GetShaderProgramCode(Maxwell::ShaderProgram::VertexB));
GLShader::MaxwellVSConfig vs_config{setup};
shader_resources =
shader_program_manager->UseProgrammableVertexShader(vs_config, setup);
break;
}
case Maxwell::ShaderProgram::VertexB: {
GLShader::MaxwellVSConfig vs_config{setup};
shader_resources =
@@ -268,6 +285,12 @@ void RasterizerOpenGL::SetupShaders(u8* buffer_ptr, GLintptr buffer_offset) {
current_texture_bindpoint =
SetupTextures(static_cast<Maxwell::ShaderStage>(stage), gl_stage_program,
current_texture_bindpoint, shader_resources.texture_samplers);
// When VertexA is enabled, we have dual vertex shaders
if (program == Maxwell::ShaderProgram::VertexA) {
// VertexB was combined with VertexA, so we skip the VertexB iteration
index++;
}
}
shader_program_manager->UseTrivialGeometryShader();
@@ -301,9 +324,6 @@ std::pair<Surface, Surface> RasterizerOpenGL::ConfigureFramebuffers(bool using_c
bool using_depth_fb) {
const auto& regs = Core::System().GetInstance().GPU().Maxwell3D().regs;
// Sync the depth test state before configuring the framebuffer surfaces.
SyncDepthTestState();
// TODO(bunnei): Implement this
const bool has_stencil = false;
@@ -368,11 +388,20 @@ void RasterizerOpenGL::Clear() {
if (regs.clear_buffers.Z) {
clear_mask |= GL_DEPTH_BUFFER_BIT;
use_depth_fb = true;
// Always enable the depth write when clearing the depth buffer. The depth write mask is
// ignored when clearing the buffer in the Switch, but OpenGL obeys it so we set it to true.
state.depth.test_enabled = true;
state.depth.write_mask = GL_TRUE;
state.depth.test_func = GL_ALWAYS;
state.Apply();
}
if (clear_mask == 0)
return;
ScopeAcquireGLContext acquire_context;
auto [dirty_color_surface, dirty_depth_surface] =
ConfigureFramebuffers(use_color_fb, use_depth_fb);
@@ -399,9 +428,12 @@ void RasterizerOpenGL::DrawArrays() {
MICROPROFILE_SCOPE(OpenGL_Drawing);
const auto& regs = Core::System().GetInstance().GPU().Maxwell3D().regs;
ScopeAcquireGLContext acquire_context;
auto [dirty_color_surface, dirty_depth_surface] =
ConfigureFramebuffers(true, regs.zeta.Address() != 0);
SyncDepthTestState();
SyncBlendState();
SyncCullMode();
@@ -605,9 +637,6 @@ u32 RasterizerOpenGL::SetupConstBuffers(Maxwell::ShaderStage stage, GLuint progr
auto& gpu = Core::System::GetInstance().GPU();
auto& maxwell3d = gpu.Get3DEngine();
ASSERT_MSG(maxwell3d.IsShaderStageEnabled(stage),
"Attempted to upload constbuffer of disabled shader stage");
// Reset all buffer draw state for this stage.
for (auto& buffer : state.draw.const_buffers[static_cast<size_t>(stage)]) {
buffer.bindpoint = 0;
@@ -674,9 +703,6 @@ u32 RasterizerOpenGL::SetupTextures(Maxwell::ShaderStage stage, GLuint program,
auto& gpu = Core::System::GetInstance().GPU();
auto& maxwell3d = gpu.Get3DEngine();
ASSERT_MSG(maxwell3d.IsShaderStageEnabled(stage),
"Attempted to upload textures of disabled shader stage");
ASSERT_MSG(current_unit + entries.size() <= std::size(state.texture_units),
"Exceeded the number of active textures.");

View File

@@ -105,6 +105,7 @@ static constexpr std::array<FormatTuple, SurfaceParams::MaxPixelFormat> tex_form
{GL_COMPRESSED_RGBA_BPTC_UNORM_ARB, GL_RGB, GL_UNSIGNED_INT_8_8_8_8, ComponentType::UNorm,
true}, // BC7U
{GL_RGBA8, GL_RGBA, GL_UNSIGNED_BYTE, ComponentType::UNorm, false}, // ASTC_2D_4X4
{GL_RG8, GL_RG, GL_UNSIGNED_BYTE, ComponentType::UNorm, false}, // G8R8
// DepthStencil formats
{GL_DEPTH24_STENCIL8, GL_DEPTH_STENCIL, GL_UNSIGNED_INT_24_8, ComponentType::UNorm,
@@ -112,6 +113,8 @@ static constexpr std::array<FormatTuple, SurfaceParams::MaxPixelFormat> tex_form
{GL_DEPTH24_STENCIL8, GL_DEPTH_STENCIL, GL_UNSIGNED_INT_24_8, ComponentType::UNorm,
false}, // S8Z24
{GL_DEPTH_COMPONENT32F, GL_DEPTH_COMPONENT, GL_FLOAT, ComponentType::Float, false}, // Z32F
{GL_DEPTH_COMPONENT16, GL_DEPTH_COMPONENT, GL_UNSIGNED_SHORT, ComponentType::UNorm,
false}, // Z16
}};
static const FormatTuple& GetFormatTuple(PixelFormat pixel_format, ComponentType component_type) {
@@ -194,8 +197,9 @@ static constexpr std::array<void (*)(u32, u32, u32, u8*, Tegra::GPUVAddr),
MortonCopy<true, PixelFormat::DXT1>, MortonCopy<true, PixelFormat::DXT23>,
MortonCopy<true, PixelFormat::DXT45>, MortonCopy<true, PixelFormat::DXN1>,
MortonCopy<true, PixelFormat::BC7U>, MortonCopy<true, PixelFormat::ASTC_2D_4X4>,
MortonCopy<true, PixelFormat::Z24S8>, MortonCopy<true, PixelFormat::S8Z24>,
MortonCopy<true, PixelFormat::Z32F>,
MortonCopy<true, PixelFormat::G8R8>, MortonCopy<true, PixelFormat::Z24S8>,
MortonCopy<true, PixelFormat::S8Z24>, MortonCopy<true, PixelFormat::Z32F>,
MortonCopy<true, PixelFormat::Z16>,
};
static constexpr std::array<void (*)(u32, u32, u32, u8*, Tegra::GPUVAddr),
@@ -215,10 +219,12 @@ static constexpr std::array<void (*)(u32, u32, u32, u8*, Tegra::GPUVAddr),
nullptr,
nullptr,
nullptr,
MortonCopy<false, PixelFormat::ABGR8>,
nullptr,
MortonCopy<false, PixelFormat::G8R8>,
MortonCopy<false, PixelFormat::Z24S8>,
MortonCopy<false, PixelFormat::S8Z24>,
MortonCopy<false, PixelFormat::Z32F>,
MortonCopy<false, PixelFormat::Z16>,
};
// Allocate an uninitialized texture of appropriate size and format for the surface
@@ -271,9 +277,10 @@ static void ConvertS8Z24ToZ24S8(std::vector<u8>& data, u32 width, u32 height) {
S8Z24 input_pixel{};
Z24S8 output_pixel{};
const auto bpp{CachedSurface::GetGLBytesPerPixel(PixelFormat::S8Z24)};
for (size_t y = 0; y < height; ++y) {
for (size_t x = 0; x < width; ++x) {
const size_t offset{y * width + x};
const size_t offset{bpp * (y * width + x)};
std::memcpy(&input_pixel, &data[offset], sizeof(S8Z24));
output_pixel.s8.Assign(input_pixel.s8);
output_pixel.z24.Assign(input_pixel.z24);
@@ -281,6 +288,19 @@ static void ConvertS8Z24ToZ24S8(std::vector<u8>& data, u32 width, u32 height) {
}
}
}
static void ConvertG8R8ToR8G8(std::vector<u8>& data, u32 width, u32 height) {
const auto bpp{CachedSurface::GetGLBytesPerPixel(PixelFormat::G8R8)};
for (size_t y = 0; y < height; ++y) {
for (size_t x = 0; x < width; ++x) {
const size_t offset{bpp * (y * width + x)};
const u8 temp{data[offset]};
data[offset] = data[offset + 1];
data[offset + 1] = temp;
}
}
}
/**
* Helper function to perform software conversion (as needed) when loading a buffer from Switch
* memory. This is for Maxwell pixel formats that cannot be represented as-is in OpenGL or with
@@ -301,6 +321,11 @@ static void ConvertFormatAsNeeded_LoadGLBuffer(std::vector<u8>& data, PixelForma
// Convert the S8Z24 depth format to Z24S8, as OpenGL does not support S8Z24.
ConvertS8Z24ToZ24S8(data, width, height);
break;
case PixelFormat::G8R8:
// Convert the G8R8 color format to R8G8, as OpenGL does not support G8R8.
ConvertG8R8ToR8G8(data, width, height);
break;
}
}

View File

@@ -37,13 +37,15 @@ struct SurfaceParams {
DXN1 = 11, // This is also known as BC4
BC7U = 12,
ASTC_2D_4X4 = 13,
G8R8 = 14,
MaxColorFormat,
// DepthStencil formats
Z24S8 = 14,
S8Z24 = 15,
Z32F = 16,
Z24S8 = 15,
S8Z24 = 16,
Z32F = 17,
Z16 = 18,
MaxDepthStencilFormat,
@@ -95,9 +97,11 @@ struct SurfaceParams {
4, // DXN1
4, // BC7U
4, // ASTC_2D_4X4
1, // G8R8
1, // Z24S8
1, // S8Z24
1, // Z32F
1, // Z16
}};
ASSERT(static_cast<size_t>(format) < compression_factor_table.size());
@@ -123,9 +127,11 @@ struct SurfaceParams {
64, // DXN1
128, // BC7U
32, // ASTC_2D_4X4
16, // G8R8
32, // Z24S8
32, // S8Z24
32, // Z32F
16, // Z16
}};
ASSERT(static_cast<size_t>(format) < bpp_table.size());
@@ -143,6 +149,8 @@ struct SurfaceParams {
return PixelFormat::Z24S8;
case Tegra::DepthFormat::Z32_FLOAT:
return PixelFormat::Z32F;
case Tegra::DepthFormat::Z16_UNORM:
return PixelFormat::Z16;
default:
LOG_CRITICAL(HW_GPU, "Unimplemented format={}", static_cast<u32>(format));
UNREACHABLE();
@@ -181,6 +189,8 @@ struct SurfaceParams {
return PixelFormat::A1B5G5R5;
case Tegra::Texture::TextureFormat::R8:
return PixelFormat::R8;
case Tegra::Texture::TextureFormat::G8R8:
return PixelFormat::G8R8;
case Tegra::Texture::TextureFormat::R16_G16_B16_A16:
return PixelFormat::RGBA16F;
case Tegra::Texture::TextureFormat::BF10GF11RF11:
@@ -218,6 +228,8 @@ struct SurfaceParams {
return Tegra::Texture::TextureFormat::A1B5G5R5;
case PixelFormat::R8:
return Tegra::Texture::TextureFormat::R8;
case PixelFormat::G8R8:
return Tegra::Texture::TextureFormat::G8R8;
case PixelFormat::RGBA16F:
return Tegra::Texture::TextureFormat::R16_G16_B16_A16;
case PixelFormat::R11FG11FB10F:
@@ -249,6 +261,8 @@ struct SurfaceParams {
return Tegra::DepthFormat::Z24_S8_UNORM;
case PixelFormat::Z32F:
return Tegra::DepthFormat::Z32_FLOAT;
case PixelFormat::Z16:
return Tegra::DepthFormat::Z16_UNORM;
default:
UNREACHABLE();
}
@@ -295,6 +309,7 @@ struct SurfaceParams {
static ComponentType ComponentTypeFromDepthFormat(Tegra::DepthFormat format) {
switch (format) {
case Tegra::DepthFormat::Z16_UNORM:
case Tegra::DepthFormat::S8_Z24_UNORM:
case Tegra::DepthFormat::Z24_S8_UNORM:
return ComponentType::UNorm;

View File

@@ -42,13 +42,14 @@ enum class ExitMethod {
struct Subroutine {
/// Generates a name suitable for GLSL source code.
std::string GetName() const {
return "sub_" + std::to_string(begin) + '_' + std::to_string(end);
return "sub_" + std::to_string(begin) + '_' + std::to_string(end) + '_' + suffix;
}
u32 begin; ///< Entry point of the subroutine.
u32 end; ///< Return point of the subroutine.
ExitMethod exit_method; ///< Exit method of the subroutine.
std::set<u32> labels; ///< Addresses refereced by JMP instructions.
u32 begin; ///< Entry point of the subroutine.
u32 end; ///< Return point of the subroutine.
const std::string& suffix; ///< Suffix of the shader, used to make a unique subroutine name
ExitMethod exit_method; ///< Exit method of the subroutine.
std::set<u32> labels; ///< Addresses refereced by JMP instructions.
bool operator<(const Subroutine& rhs) const {
return std::tie(begin, end) < std::tie(rhs.begin, rhs.end);
@@ -58,11 +59,11 @@ struct Subroutine {
/// Analyzes shader code and produces a set of subroutines.
class ControlFlowAnalyzer {
public:
ControlFlowAnalyzer(const ProgramCode& program_code, u32 main_offset)
ControlFlowAnalyzer(const ProgramCode& program_code, u32 main_offset, const std::string& suffix)
: program_code(program_code) {
// Recursively finds all subroutines.
const Subroutine& program_main = AddSubroutine(main_offset, PROGRAM_END);
const Subroutine& program_main = AddSubroutine(main_offset, PROGRAM_END, suffix);
if (program_main.exit_method != ExitMethod::AlwaysEnd)
throw DecompileFail("Program does not always end");
}
@@ -77,12 +78,12 @@ private:
std::map<std::pair<u32, u32>, ExitMethod> exit_method_map;
/// Adds and analyzes a new subroutine if it is not added yet.
const Subroutine& AddSubroutine(u32 begin, u32 end) {
auto iter = subroutines.find(Subroutine{begin, end});
const Subroutine& AddSubroutine(u32 begin, u32 end, const std::string& suffix) {
auto iter = subroutines.find(Subroutine{begin, end, suffix});
if (iter != subroutines.end())
return *iter;
Subroutine subroutine{begin, end};
Subroutine subroutine{begin, end, suffix};
subroutine.exit_method = Scan(begin, end, subroutine.labels);
if (subroutine.exit_method == ExitMethod::Undetermined)
throw DecompileFail("Recursive function detected");
@@ -191,7 +192,8 @@ public:
UnsignedInteger,
};
GLSLRegister(size_t index, ShaderWriter& shader) : index{index}, shader{shader} {}
GLSLRegister(size_t index, ShaderWriter& shader, const std::string& suffix)
: index{index}, shader{shader}, suffix{suffix} {}
/// Gets the GLSL type string for a register
static std::string GetTypeString(Type type) {
@@ -216,7 +218,7 @@ public:
/// Returns a GLSL string representing the current state of the register
const std::string GetActiveString() {
declr_type.insert(active_type);
return GetPrefixString(active_type) + std::to_string(index);
return GetPrefixString(active_type) + std::to_string(index) + '_' + suffix;
}
/// Returns true if the active type is a float
@@ -251,6 +253,7 @@ private:
ShaderWriter& shader;
Type active_type{Type::Float};
std::set<Type> declr_type;
const std::string& suffix;
};
/**
@@ -262,8 +265,8 @@ private:
class GLSLRegisterManager {
public:
GLSLRegisterManager(ShaderWriter& shader, ShaderWriter& declarations,
const Maxwell3D::Regs::ShaderStage& stage)
: shader{shader}, declarations{declarations}, stage{stage} {
const Maxwell3D::Regs::ShaderStage& stage, const std::string& suffix)
: shader{shader}, declarations{declarations}, stage{stage}, suffix{suffix} {
BuildRegisterList();
}
@@ -430,12 +433,12 @@ public:
}
/// Add declarations for registers
void GenerateDeclarations() {
void GenerateDeclarations(const std::string& suffix) {
for (const auto& reg : regs) {
for (const auto& type : reg.DeclaredTypes()) {
declarations.AddLine(GLSLRegister::GetTypeString(type) + ' ' +
GLSLRegister::GetPrefixString(type) +
std::to_string(reg.GetIndex()) + " = 0;");
reg.GetPrefixString(type) + std::to_string(reg.GetIndex()) +
'_' + suffix + " = 0;");
}
}
declarations.AddNewLine();
@@ -558,7 +561,7 @@ private:
/// Build the GLSL register list.
void BuildRegisterList() {
for (size_t index = 0; index < Register::NumRegisters; ++index) {
regs.emplace_back(index, shader);
regs.emplace_back(index, shader, suffix);
}
}
@@ -620,16 +623,17 @@ private:
std::array<ConstBufferEntry, Maxwell3D::Regs::MaxConstBuffers> declr_const_buffers;
std::vector<SamplerEntry> used_samplers;
const Maxwell3D::Regs::ShaderStage& stage;
const std::string& suffix;
};
class GLSLGenerator {
public:
GLSLGenerator(const std::set<Subroutine>& subroutines, const ProgramCode& program_code,
u32 main_offset, Maxwell3D::Regs::ShaderStage stage)
u32 main_offset, Maxwell3D::Regs::ShaderStage stage, const std::string& suffix)
: subroutines(subroutines), program_code(program_code), main_offset(main_offset),
stage(stage) {
stage(stage), suffix(suffix) {
Generate();
Generate(suffix);
}
std::string GetShaderCode() {
@@ -644,7 +648,7 @@ public:
private:
/// Gets the Subroutine object corresponding to the specified address.
const Subroutine& GetSubroutine(u32 begin, u32 end) const {
auto iter = subroutines.find(Subroutine{begin, end});
auto iter = subroutines.find(Subroutine{begin, end, suffix});
ASSERT(iter != subroutines.end());
return *iter;
}
@@ -689,7 +693,7 @@ private:
// Can't assign to the constant predicate.
ASSERT(pred != static_cast<u64>(Pred::UnusedIndex));
std::string variable = 'p' + std::to_string(pred);
std::string variable = 'p' + std::to_string(pred) + '_' + suffix;
shader.AddLine(variable + " = " + value + ';');
declr_predicates.insert(std::move(variable));
}
@@ -707,7 +711,7 @@ private:
if (index == static_cast<u64>(Pred::UnusedIndex))
variable = "true";
else
variable = 'p' + std::to_string(index);
variable = 'p' + std::to_string(index) + '_' + suffix;
if (negate) {
return "!(" + variable + ')';
@@ -728,10 +732,10 @@ private:
const std::string& op_a, const std::string& op_b) const {
using Tegra::Shader::PredCondition;
static const std::unordered_map<PredCondition, const char*> PredicateComparisonStrings = {
{PredCondition::LessThan, "<"}, {PredCondition::Equal, "=="},
{PredCondition::LessEqual, "<="}, {PredCondition::GreaterThan, ">"},
{PredCondition::NotEqual, "!="}, {PredCondition::GreaterEqual, ">="},
{PredCondition::NotEqualWithNan, "!="},
{PredCondition::LessThan, "<"}, {PredCondition::Equal, "=="},
{PredCondition::LessEqual, "<="}, {PredCondition::GreaterThan, ">"},
{PredCondition::NotEqual, "!="}, {PredCondition::GreaterEqual, ">="},
{PredCondition::LessThanWithNan, "<"}, {PredCondition::NotEqualWithNan, "!="},
};
const auto& comparison{PredicateComparisonStrings.find(condition)};
@@ -739,7 +743,8 @@ private:
"Unknown predicate comparison operation");
std::string predicate{'(' + op_a + ") " + comparison->second + " (" + op_b + ')'};
if (condition == PredCondition::NotEqualWithNan) {
if (condition == PredCondition::LessThanWithNan ||
condition == PredCondition::NotEqualWithNan) {
predicate += " || isnan(" + op_a + ") || isnan(" + op_b + ')';
}
@@ -968,6 +973,29 @@ private:
regs.GetRegisterAsFloat(instr.gpr8) + " * " + GetImmediate32(instr), 1, 1);
break;
}
case OpCode::Id::FADD32I: {
std::string op_a = regs.GetRegisterAsFloat(instr.gpr8);
std::string op_b = GetImmediate32(instr);
if (instr.fadd32i.abs_a) {
op_a = "abs(" + op_a + ')';
}
if (instr.fadd32i.negate_a) {
op_a = "-(" + op_a + ')';
}
if (instr.fadd32i.abs_b) {
op_b = "abs(" + op_b + ')';
}
if (instr.fadd32i.negate_b) {
op_b = "-(" + op_b + ')';
}
regs.SetRegisterToFloat(instr.gpr0, 0, op_a + " + " + op_b, 1, 1);
break;
}
}
break;
}
@@ -1616,16 +1644,32 @@ private:
shader.AddLine("color.a = " + regs.GetRegisterAsFloat(3) + ';');
}
shader.AddLine("return true;");
if (instr.pred.pred_index == static_cast<u64>(Pred::UnusedIndex)) {
// If this is an unconditional exit then just end processing here, otherwise
// we have to account for the possibility of the condition not being met, so
// continue processing the next instruction.
offset = PROGRAM_END - 1;
switch (instr.flow.cond) {
case Tegra::Shader::FlowCondition::Always:
shader.AddLine("return true;");
if (instr.pred.pred_index == static_cast<u64>(Pred::UnusedIndex)) {
// If this is an unconditional exit then just end processing here,
// otherwise we have to account for the possibility of the condition
// not being met, so continue processing the next instruction.
offset = PROGRAM_END - 1;
}
break;
case Tegra::Shader::FlowCondition::Fcsm_Tr:
// TODO(bunnei): What is this used for? If we assume this conditon is not
// satisifed, dual vertex shaders in Farming Simulator make more sense
LOG_CRITICAL(HW_GPU, "Skipping unknown FlowCondition::Fcsm_Tr");
break;
default:
LOG_CRITICAL(HW_GPU, "Unhandled flow condition: {}",
static_cast<u32>(instr.flow.cond.Value()));
UNREACHABLE();
}
break;
}
case OpCode::Id::KIL: {
ASSERT(instr.flow.cond == Tegra::Shader::FlowCondition::Always);
shader.AddLine("discard;");
break;
}
@@ -1646,8 +1690,9 @@ private:
// can ignore this when generating GLSL code.
break;
}
case OpCode::Id::DEPBAR:
case OpCode::Id::SYNC: {
case OpCode::Id::SYNC:
ASSERT(instr.flow.cond == Tegra::Shader::FlowCondition::Always);
case OpCode::Id::DEPBAR: {
// TODO(Subv): Find out if we actually have to care about these instructions or if
// the GLSL compiler takes care of that for us.
LOG_WARNING(HW_GPU, "DEPBAR/SYNC instruction is stubbed");
@@ -1687,7 +1732,7 @@ private:
return program_counter;
}
void Generate() {
void Generate(const std::string& suffix) {
// Add declarations for all subroutines
for (const auto& subroutine : subroutines) {
shader.AddLine("bool " + subroutine.GetName() + "();");
@@ -1695,7 +1740,7 @@ private:
shader.AddNewLine();
// Add the main entry point
shader.AddLine("bool exec_shader() {");
shader.AddLine("bool exec_" + suffix + "() {");
++shader.scope;
CallSubroutine(GetSubroutine(main_offset, PROGRAM_END));
--shader.scope;
@@ -1758,7 +1803,7 @@ private:
/// Add declarations for registers
void GenerateDeclarations() {
regs.GenerateDeclarations();
regs.GenerateDeclarations(suffix);
for (const auto& pred : declr_predicates) {
declarations.AddLine("bool " + pred + " = false;");
@@ -1771,27 +1816,30 @@ private:
const ProgramCode& program_code;
const u32 main_offset;
Maxwell3D::Regs::ShaderStage stage;
const std::string& suffix;
ShaderWriter shader;
ShaderWriter declarations;
GLSLRegisterManager regs{shader, declarations, stage};
GLSLRegisterManager regs{shader, declarations, stage, suffix};
// Declarations
std::set<std::string> declr_predicates;
}; // namespace Decompiler
std::string GetCommonDeclarations() {
std::string declarations = "bool exec_shader();\n";
std::string declarations;
declarations += "#define MAX_CONSTBUFFER_ELEMENTS " +
std::to_string(RasterizerOpenGL::MaxConstbufferSize / (sizeof(GLvec4)));
declarations += '\n';
return declarations;
}
boost::optional<ProgramResult> DecompileProgram(const ProgramCode& program_code, u32 main_offset,
Maxwell3D::Regs::ShaderStage stage) {
Maxwell3D::Regs::ShaderStage stage,
const std::string& suffix) {
try {
auto subroutines = ControlFlowAnalyzer(program_code, main_offset).GetSubroutines();
GLSLGenerator generator(subroutines, program_code, main_offset, stage);
auto subroutines = ControlFlowAnalyzer(program_code, main_offset, suffix).GetSubroutines();
GLSLGenerator generator(subroutines, program_code, main_offset, stage, suffix);
return ProgramResult{generator.GetShaderCode(), generator.GetEntries()};
} catch (const DecompileFail& exception) {
LOG_ERROR(HW_GPU, "Shader decompilation failed: {}", exception.what());

View File

@@ -20,7 +20,8 @@ using Tegra::Engines::Maxwell3D;
std::string GetCommonDeclarations();
boost::optional<ProgramResult> DecompileProgram(const ProgramCode& program_code, u32 main_offset,
Maxwell3D::Regs::ShaderStage stage);
Maxwell3D::Regs::ShaderStage stage,
const std::string& suffix);
} // namespace Decompiler
} // namespace GLShader

View File

@@ -17,10 +17,17 @@ ProgramResult GenerateVertexShader(const ShaderSetup& setup, const MaxwellVSConf
std::string out = "#version 430 core\n";
out += "#extension GL_ARB_separate_shader_objects : enable\n\n";
out += Decompiler::GetCommonDeclarations();
out += "bool exec_vertex();\n";
if (setup.IsDualProgram()) {
out += "bool exec_vertex_b();\n";
}
ProgramResult program =
Decompiler::DecompileProgram(setup.program.code, PROGRAM_OFFSET,
Maxwell3D::Regs::ShaderStage::Vertex, "vertex")
.get_value_or({});
ProgramResult program = Decompiler::DecompileProgram(setup.program_code, PROGRAM_OFFSET,
Maxwell3D::Regs::ShaderStage::Vertex)
.get_value_or({});
out += R"(
out gl_PerVertex {
@@ -34,7 +41,14 @@ layout (std140) uniform vs_config {
};
void main() {
exec_shader();
exec_vertex();
)";
if (setup.IsDualProgram()) {
out += " exec_vertex_b();";
}
out += R"(
// Viewport can be flipped, which is unsupported by glViewport
position.xy *= viewport_flip.xy;
@@ -44,8 +58,19 @@ void main() {
// For now, this is here to bring order in lieu of proper emulation
position.w = 1.0;
}
)";
out += program.first;
if (setup.IsDualProgram()) {
ProgramResult program_b =
Decompiler::DecompileProgram(setup.program.code_b, PROGRAM_OFFSET,
Maxwell3D::Regs::ShaderStage::Vertex, "vertex_b")
.get_value_or({});
out += program_b.first;
}
return {out, program.second};
}
@@ -53,12 +78,13 @@ ProgramResult GenerateFragmentShader(const ShaderSetup& setup, const MaxwellFSCo
std::string out = "#version 430 core\n";
out += "#extension GL_ARB_separate_shader_objects : enable\n\n";
out += Decompiler::GetCommonDeclarations();
out += "bool exec_fragment();\n";
ProgramResult program = Decompiler::DecompileProgram(setup.program_code, PROGRAM_OFFSET,
Maxwell3D::Regs::ShaderStage::Fragment)
.get_value_or({});
ProgramResult program =
Decompiler::DecompileProgram(setup.program.code, PROGRAM_OFFSET,
Maxwell3D::Regs::ShaderStage::Fragment, "fragment")
.get_value_or({});
out += R"(
in vec4 position;
out vec4 color;
@@ -67,7 +93,7 @@ layout (std140) uniform fs_config {
};
void main() {
exec_shader();
exec_fragment();
}
)";

View File

@@ -115,21 +115,48 @@ struct ShaderEntries {
using ProgramResult = std::pair<std::string, ShaderEntries>;
struct ShaderSetup {
ShaderSetup(ProgramCode&& program_code) : program_code(std::move(program_code)) {}
ShaderSetup(const ProgramCode& program_code) {
program.code = program_code;
}
struct {
ProgramCode code;
ProgramCode code_b; // Used for dual vertex shaders
} program;
ProgramCode program_code;
bool program_code_hash_dirty = true;
u64 GetProgramCodeHash() {
if (program_code_hash_dirty) {
program_code_hash = Common::ComputeHash64(&program_code, sizeof(program_code));
program_code_hash = GetNewHash();
program_code_hash_dirty = false;
}
return program_code_hash;
}
/// Used in scenarios where we have a dual vertex shaders
void SetProgramB(const ProgramCode& program_b) {
program.code_b = program_b;
has_program_b = true;
}
bool IsDualProgram() const {
return has_program_b;
}
private:
u64 GetNewHash() const {
if (has_program_b) {
// Compute hash over dual shader programs
return Common::ComputeHash64(&program, sizeof(program));
} else {
// Compute hash over a single shader program
return Common::ComputeHash64(&program.code, program.code.size());
}
}
u64 program_code_hash{};
bool has_program_b{};
};
struct MaxwellShaderConfigCommon {

View File

@@ -92,11 +92,24 @@ static std::array<GLfloat, 3 * 2> MakeOrthographicMatrix(const float width, cons
return matrix;
}
ScopeAcquireGLContext::ScopeAcquireGLContext() {
if (Settings::values.use_multi_core) {
VideoCore::g_emu_window->MakeCurrent();
}
}
ScopeAcquireGLContext::~ScopeAcquireGLContext() {
if (Settings::values.use_multi_core) {
VideoCore::g_emu_window->DoneCurrent();
}
}
RendererOpenGL::RendererOpenGL() = default;
RendererOpenGL::~RendererOpenGL() = default;
/// Swap buffers (render frame)
void RendererOpenGL::SwapBuffers(boost::optional<const Tegra::FramebufferConfig&> framebuffer) {
ScopeAcquireGLContext acquire_context;
Core::System::GetInstance().perf_stats.EndSystemFrame();
// Maintain the rasterizer's state as a priority
@@ -418,7 +431,7 @@ static void APIENTRY DebugHandler(GLenum source, GLenum type, GLuint id, GLenum
/// Initialize the renderer
bool RendererOpenGL::Init() {
render_window->MakeCurrent();
ScopeAcquireGLContext acquire_context;
if (GLAD_GL_KHR_debug) {
glEnable(GL_DEBUG_OUTPUT);

View File

@@ -31,6 +31,13 @@ struct ScreenInfo {
TextureInfo texture;
};
/// Helper class to acquire/release OpenGL context within a given scope
class ScopeAcquireGLContext : NonCopyable {
public:
ScopeAcquireGLContext();
~ScopeAcquireGLContext();
};
class RendererOpenGL : public RendererBase {
public:
RendererOpenGL();

View File

@@ -62,6 +62,7 @@ u32 BytesPerPixel(TextureFormat format) {
return 4;
case TextureFormat::A1B5G5R5:
case TextureFormat::B5G6R5:
case TextureFormat::G8R8:
return 2;
case TextureFormat::R8:
return 1;
@@ -77,6 +78,8 @@ u32 BytesPerPixel(TextureFormat format) {
static u32 DepthBytesPerPixel(DepthFormat format) {
switch (format) {
case DepthFormat::Z16_UNORM:
return 2;
case DepthFormat::S8_Z24_UNORM:
case DepthFormat::Z24_S8_UNORM:
case DepthFormat::Z32_FLOAT:
@@ -110,6 +113,7 @@ std::vector<u8> UnswizzleTexture(VAddr address, TextureFormat format, u32 width,
case TextureFormat::A1B5G5R5:
case TextureFormat::B5G6R5:
case TextureFormat::R8:
case TextureFormat::G8R8:
case TextureFormat::R16_G16_B16_A16:
case TextureFormat::R32_G32_B32_A32:
case TextureFormat::BF10GF11RF11:
@@ -133,6 +137,7 @@ std::vector<u8> UnswizzleDepthTexture(VAddr address, DepthFormat format, u32 wid
std::vector<u8> unswizzled_data(width * height * bytes_per_pixel);
switch (format) {
case DepthFormat::Z16_UNORM:
case DepthFormat::S8_Z24_UNORM:
case DepthFormat::Z24_S8_UNORM:
case DepthFormat::Z32_FLOAT:
@@ -164,6 +169,7 @@ std::vector<u8> DecodeTexture(const std::vector<u8>& texture_data, TextureFormat
case TextureFormat::A1B5G5R5:
case TextureFormat::B5G6R5:
case TextureFormat::R8:
case TextureFormat::G8R8:
case TextureFormat::BF10GF11RF11:
case TextureFormat::R32_G32_B32_A32:
// TODO(Subv): For the time being just forward the same data without any decoding.

View File

@@ -20,7 +20,10 @@
EmuThread::EmuThread(GRenderWindow* render_window) : render_window(render_window) {}
void EmuThread::run() {
render_window->MakeCurrent();
if (!Settings::values.use_multi_core) {
// Single core mode must acquire OpenGL context for entire emulation session
render_window->MakeCurrent();
}
MicroProfileOnThreadCreate("EmuThread");

View File

@@ -14,6 +14,13 @@
namespace Debugger {
void ToggleConsole() {
static bool console_shown = false;
if (console_shown == UISettings::values.show_console) {
return;
} else {
console_shown = UISettings::values.show_console;
}
#if defined(_WIN32) && !defined(_DEBUG)
FILE* temp;
if (UISettings::values.show_console) {

View File

@@ -374,6 +374,8 @@ bool GMainWindow::LoadROM(const QString& filename) {
const Core::System::ResultStatus result{system.Load(render_window, filename.toStdString())};
render_window->DoneCurrent();
if (result != Core::System::ResultStatus::Success) {
switch (result) {
case Core::System::ResultStatus::ErrorGetLoader:
@@ -908,8 +910,6 @@ void GMainWindow::UpdateUITheme() {
#endif
int main(int argc, char* argv[]) {
Log::AddBackend(std::make_unique<Log::ColorConsoleBackend>());
MicroProfileOnThreadCreate("Frontend");
SCOPE_EXIT({ MicroProfileShutdown(); });
@@ -918,6 +918,7 @@ int main(int argc, char* argv[]) {
QCoreApplication::setApplicationName("yuzu");
QApplication::setAttribute(Qt::AA_X11InitThreads);
QApplication::setAttribute(Qt::AA_DontCheckOpenGLContextThreadAffinity);
QApplication app(argc, argv);
// Qt changes the locale and causes issues in float conversion using std::to_string() when

View File

@@ -126,7 +126,7 @@ EmuWindow_SDL2::EmuWindow_SDL2(bool fullscreen) {
SDL_WINDOW_OPENGL | SDL_WINDOW_RESIZABLE | SDL_WINDOW_ALLOW_HIGHDPI);
if (render_window == nullptr) {
LOG_CRITICAL(Frontend, "Failed to create SDL2 window! Exiting...");
LOG_CRITICAL(Frontend, "Failed to create SDL2 window! {}", SDL_GetError());
exit(1);
}
@@ -137,12 +137,12 @@ EmuWindow_SDL2::EmuWindow_SDL2(bool fullscreen) {
gl_context = SDL_GL_CreateContext(render_window);
if (gl_context == nullptr) {
LOG_CRITICAL(Frontend, "Failed to create SDL2 GL context! Exiting...");
LOG_CRITICAL(Frontend, "Failed to create SDL2 GL context! {}", SDL_GetError());
exit(1);
}
if (!gladLoadGLLoader(static_cast<GLADloadproc>(SDL_GL_GetProcAddress))) {
LOG_CRITICAL(Frontend, "Failed to initialize GL functions! Exiting...");
LOG_CRITICAL(Frontend, "Failed to initialize GL functions! {}", SDL_GetError());
exit(1);
}

View File

@@ -22,10 +22,8 @@
#include "yuzu_cmd/config.h"
#include "yuzu_cmd/emu_window/emu_window_sdl2.h"
#ifdef _MSC_VER
#include <getopt.h>
#else
#include <getopt.h>
#ifndef _MSC_VER
#include <unistd.h>
#endif
@@ -127,6 +125,7 @@ int main(int argc, char** argv) {
#endif
Log::Filter log_filter(Log::Level::Debug);
log_filter.ParseFilterString(Settings::values.log_filter);
Log::SetGlobalFilter(log_filter);
Log::AddBackend(std::make_unique<Log::ColorConsoleBackend>());
@@ -142,8 +141,6 @@ int main(int argc, char** argv) {
return -1;
}
log_filter.ParseFilterString(Settings::values.log_filter);
// Apply the command line arguments
Settings::values.gdbstub_port = gdb_port;
Settings::values.use_gdbstub = use_gdbstub;
@@ -151,6 +148,11 @@ int main(int argc, char** argv) {
std::unique_ptr<EmuWindow_SDL2> emu_window{std::make_unique<EmuWindow_SDL2>(fullscreen)};
if (!Settings::values.use_multi_core) {
// Single core mode must acquire OpenGL context for entire emulation session
emu_window->MakeCurrent();
}
Core::System& system{Core::System::GetInstance()};
SCOPE_EXIT({ system.Shutdown(); });