Compare commits

..

41 Commits

Author SHA1 Message Date
bunnei
799e632ccb gl_shader_decompiler: Fix typo with ISCADD instruction. 2018-06-04 22:41:10 -04:00
bunnei
c23c30c76f gl_shader_decompiler: Implement SHL instruction. 2018-06-04 22:36:49 -04:00
bunnei
00749f5ab3 Merge pull request #519 from bunnei/pred-not-equal
gl_shader_decompiler: Implement PredCondition::NotEqual.
2018-06-04 22:18:22 -04:00
bunnei
6ea1576513 gl_shader_decompiler: Implement PredCondition::NotEqual. 2018-06-04 22:00:47 -04:00
bunnei
81a16c073a Merge pull request #517 from Subv/iscadd
GPU: Implement the ISCADD shader instruction.
2018-06-04 21:59:55 -04:00
Subv
23b1e6eded GPU: Implement the ISCADD shader instructions. 2018-06-04 20:17:41 -05:00
Subv
438a9b70cc GPU: Added decodings for the ISCADD instructions. 2018-06-04 20:17:39 -05:00
bunnei
e8bfff7b4b Merge pull request #514 from Subv/lop32i
GPU: Implemented the LOP32I instruction.
2018-06-04 20:48:15 -04:00
bunnei
f564822e78 Merge pull request #510 from Subv/isetp
GPU: Implemented the ISETP_R and ISETP_C instructions
2018-06-04 20:47:11 -04:00
bunnei
37fd4e6d9b Merge pull request #512 from Subv/fset
GPU: Corrected the FSET and I2F instructions.
2018-06-04 19:04:20 -04:00
bunnei
cdd92dc692 Merge pull request #501 from Subv/shader_bra
GPU: Partially implemented the bra shader instruction
2018-06-04 18:31:07 -04:00
bunnei
38d25a4cb2 Merge pull request #515 from Subv/viewport_fix
GPU: Calculate the correct viewport dimensions based on the scale and translate registers.
2018-06-04 18:11:36 -04:00
Subv
2933521a08 GPU: Use the bf bit in FSET to determine whether to write 0xFFFFFFFF or 1.0f. 2018-06-04 16:41:28 -05:00
Subv
f6679ce422 GPU: Corrected the I2F_R implementation. 2018-06-04 16:41:27 -05:00
Subv
5d55403f94 GPU: Calculate the correct viewport dimensions based on the scale and translate registers.
This is how nouveau calculates the viewport width and height. For some reason some games set 0xFFFF in the VIEWPORT_HORIZ and VIEWPORT_VERT registers, maybe those are a misnomer and actually refer to something else?
2018-06-04 16:36:54 -05:00
bunnei
0a0233f39f Merge pull request #490 from BreadFish64/extension-check
Add checks for OpenGL extension support
2018-06-04 16:13:55 -04:00
bunnei
9936d1b9e2 Merge pull request #513 from Subv/cache_alignment
GLCache: Corrected a mismatch between storing compressed sizes and verifying the uncompressed alignment in GetSurface.
2018-06-04 16:12:55 -04:00
greggameplayer
4fad069870 Nvdrv/devices/nvhost_gpu : Add some IoctlCommands with their params (#511)
* Add some IoctlCommand with their params to nvhost_gpu

* fix clang-format

* delete trailing whitespace

* fix some clang-format

* delete one other trailing whitespace

* last clang-format fix
2018-06-04 16:12:02 -04:00
Subv
0c688b421c GPU: Implemented the LOP32I instruction. 2018-06-04 13:56:31 -05:00
Subv
cb47abecc6 GLCache: Corrected a mismatch between storing compressed sizes and verifying the uncompressed alignment in GetSurface. 2018-06-04 13:01:53 -05:00
BreadFish64
fbef849c04 sdl: add check for GL extension support 2018-06-04 12:26:41 -05:00
BreadFish64
0641950f9a qt: add check for GL extension support 2018-06-04 12:26:30 -05:00
bunnei
b7c64f0ded Merge pull request #502 from bunnei/more-am-stuff
am: Implement PopOutData, and various fixes.
2018-06-04 13:23:19 -04:00
Subv
90cddf1996 GPU: Use explicit types when retrieving the uniform values for fsetp/fset and isetp instead of the type of an invalid output register. 2018-06-04 11:22:26 -05:00
Subv
7c181fd4f4 GPU: Implemented the ISETP_R and ISETP_C shader instructions. 2018-06-04 11:12:03 -05:00
James Rowe
d16f83fda3 Merge pull request #507 from valentinvanelslande/3616
Port citra #3616
2018-06-04 10:04:18 -06:00
Valentin Vanelslande
5c82400ef8 Port citra #3616 2018-06-04 10:57:18 -05:00
bunnei
afdd2f4cad am: Implement ILibraryAppletAccessor::PopOutData. 2018-06-03 23:44:23 -04:00
bunnei
df4336a85e am: ISelfController:LaunchableEvent should be sticky. 2018-06-03 23:44:22 -04:00
bunnei
51d8a2c322 am: Stub out ILibraryAppletAccessor Start and GetResult methods. 2018-06-03 23:44:22 -04:00
bunnei
049ce242a4 Merge pull request #499 from bunnei/am-stuff
am: Implement CreateStorage, PushInData, etc.
2018-06-03 23:43:52 -04:00
Subv
b481d8a00d GPU: Partially implemented the shader BRA instruction. 2018-06-03 22:26:36 -05:00
Subv
06c72b4fcf GPU: Added decoding for the BRA instruction. 2018-06-03 22:14:00 -05:00
bunnei
876b805e50 am: Implement ILibraryAppletAccessor::PushInData. 2018-06-03 22:10:06 -04:00
bunnei
2dcb98226b am: Implement IStorageAccessor::Write. 2018-06-03 22:10:06 -04:00
bunnei
9fedfbe141 am: Cleanup IStorageAccessor::Read. 2018-06-03 22:10:06 -04:00
bunnei
d73c22bf4d am: Implement ILibraryAppletCreator::CreateStorage. 2018-06-03 22:10:05 -04:00
bunnei
ba117854f9 Merge pull request #500 from Subv/long_queries
GPU: Partial implementation of long GPU queries.
2018-06-03 21:24:50 -04:00
bunnei
527c098ff6 Merge pull request #498 from bunnei/texs-mask
gl_shader_decompiler: Implement TEXS component mask.
2018-06-03 21:22:12 -04:00
Subv
d57333406d GPU: Partial implementation of long GPU queries.
Long queries write a 128-bit result value to memory, which consists of a 64 bit query value and a 64 bit timestamp.

In this implementation, only select=Zero of the Crop unit is implemented, this writes the query sequence as a 64 bit value, and a 0u64 value for the timestamp, since we emulate an infinitely fast GPU.

This specific type was hwtested, but more rigorous tests should be performed in the future for the other types.
2018-06-03 19:17:31 -05:00
bunnei
1efcba346a gl_shader_decompiler: Implement TEXS component mask. 2018-06-03 12:08:17 -04:00
13 changed files with 540 additions and 122 deletions

View File

@@ -53,7 +53,7 @@ build_script:
# https://www.appveyor.com/docs/build-phase
msbuild msvc_build/yuzu.sln /maxcpucount /logger:"C:\Program Files\AppVeyor\BuildAgent\Appveyor.MSBuildLogger.dll"
} else {
C:\msys64\usr\bin\bash.exe -lc 'mingw32-make -C mingw_build/ 2>&1'
C:\msys64\usr\bin\bash.exe -lc 'mingw32-make -j4 -C mingw_build/ 2>&1'
}
after_build:

View File

@@ -3,6 +3,7 @@
// Refer to the license.txt file included.
#include <cinttypes>
#include <stack>
#include "core/file_sys/filesystem.h"
#include "core/hle/ipc_helpers.h"
#include "core/hle/kernel/event.h"
@@ -154,7 +155,7 @@ ISelfController::ISelfController(std::shared_ptr<NVFlinger::NVFlinger> nvflinger
RegisterHandlers(functions);
launchable_event =
Kernel::Event::Create(Kernel::ResetType::OneShot, "ISelfController:LaunchableEvent");
Kernel::Event::Create(Kernel::ResetType::Sticky, "ISelfController:LaunchableEvent");
}
void ISelfController::SetFocusHandlingMode(Kernel::HLERequestContext& ctx) {
@@ -348,19 +349,100 @@ void ICommonStateGetter::GetPerformanceMode(Kernel::HLERequestContext& ctx) {
NGLOG_WARNING(Service_AM, "(STUBBED) called");
}
class IStorageAccessor final : public ServiceFramework<IStorageAccessor> {
public:
explicit IStorageAccessor(std::vector<u8> buffer)
: ServiceFramework("IStorageAccessor"), buffer(std::move(buffer)) {
static const FunctionInfo functions[] = {
{0, &IStorageAccessor::GetSize, "GetSize"},
{10, &IStorageAccessor::Write, "Write"},
{11, &IStorageAccessor::Read, "Read"},
};
RegisterHandlers(functions);
}
private:
std::vector<u8> buffer;
void GetSize(Kernel::HLERequestContext& ctx) {
IPC::ResponseBuilder rb{ctx, 4};
rb.Push(RESULT_SUCCESS);
rb.Push(static_cast<u64>(buffer.size()));
NGLOG_DEBUG(Service_AM, "called");
}
void Write(Kernel::HLERequestContext& ctx) {
IPC::RequestParser rp{ctx};
const u64 offset{rp.Pop<u64>()};
const std::vector<u8> data{ctx.ReadBuffer()};
ASSERT(offset + data.size() <= buffer.size());
std::memcpy(&buffer[offset], data.data(), data.size());
IPC::ResponseBuilder rb{rp.MakeBuilder(2, 0, 0)};
rb.Push(RESULT_SUCCESS);
NGLOG_DEBUG(Service_AM, "called, offset={}", offset);
}
void Read(Kernel::HLERequestContext& ctx) {
IPC::RequestParser rp{ctx};
const u64 offset{rp.Pop<u64>()};
const size_t size{ctx.GetWriteBufferSize()};
ASSERT(offset + size <= buffer.size());
ctx.WriteBuffer(buffer.data() + offset, size);
IPC::ResponseBuilder rb{rp.MakeBuilder(2, 0, 0)};
rb.Push(RESULT_SUCCESS);
NGLOG_DEBUG(Service_AM, "called, offset={}", offset);
}
};
class IStorage final : public ServiceFramework<IStorage> {
public:
explicit IStorage(std::vector<u8> buffer)
: ServiceFramework("IStorage"), buffer(std::move(buffer)) {
static const FunctionInfo functions[] = {
{0, &IStorage::Open, "Open"},
{1, nullptr, "OpenTransferStorage"},
};
RegisterHandlers(functions);
}
private:
std::vector<u8> buffer;
void Open(Kernel::HLERequestContext& ctx) {
IPC::ResponseBuilder rb{ctx, 2, 0, 1};
rb.Push(RESULT_SUCCESS);
rb.PushIpcInterface<AM::IStorageAccessor>(buffer);
NGLOG_DEBUG(Service_AM, "called");
}
};
class ILibraryAppletAccessor final : public ServiceFramework<ILibraryAppletAccessor> {
public:
explicit ILibraryAppletAccessor() : ServiceFramework("ILibraryAppletAccessor") {
static const FunctionInfo functions[] = {
{0, &ILibraryAppletAccessor::GetAppletStateChangedEvent, "GetAppletStateChangedEvent"},
{1, nullptr, "IsCompleted"},
{10, nullptr, "Start"},
{10, &ILibraryAppletAccessor::Start, "Start"},
{20, nullptr, "RequestExit"},
{25, nullptr, "Terminate"},
{30, nullptr, "GetResult"},
{30, &ILibraryAppletAccessor::GetResult, "GetResult"},
{50, nullptr, "SetOutOfFocusApplicationSuspendingEnabled"},
{100, nullptr, "PushInData"},
{101, nullptr, "PopOutData"},
{100, &ILibraryAppletAccessor::PushInData, "PushInData"},
{101, &ILibraryAppletAccessor::PopOutData, "PopOutData"},
{102, nullptr, "PushExtraStorage"},
{103, nullptr, "PushInteractiveInData"},
{104, nullptr, "PopInteractiveOutData"},
@@ -388,6 +470,41 @@ private:
NGLOG_WARNING(Service_AM, "(STUBBED) called");
}
void GetResult(Kernel::HLERequestContext& ctx) {
IPC::ResponseBuilder rb{ctx, 2};
rb.Push(RESULT_SUCCESS);
NGLOG_WARNING(Service_AM, "(STUBBED) called");
}
void Start(Kernel::HLERequestContext& ctx) {
IPC::ResponseBuilder rb{ctx, 2};
rb.Push(RESULT_SUCCESS);
NGLOG_WARNING(Service_AM, "(STUBBED) called");
}
void PushInData(Kernel::HLERequestContext& ctx) {
IPC::RequestParser rp{ctx};
storage_stack.push(rp.PopIpcInterface<AM::IStorage>());
IPC::ResponseBuilder rb{rp.MakeBuilder(2, 0, 0)};
rb.Push(RESULT_SUCCESS);
NGLOG_DEBUG(Service_AM, "called");
}
void PopOutData(Kernel::HLERequestContext& ctx) {
IPC::ResponseBuilder rb{ctx, 2, 0, 1};
rb.Push(RESULT_SUCCESS);
rb.PushIpcInterface<AM::IStorage>(std::move(storage_stack.top()));
storage_stack.pop();
NGLOG_DEBUG(Service_AM, "called");
}
std::stack<std::shared_ptr<AM::IStorage>> storage_stack;
Kernel::SharedPtr<Kernel::Event> state_changed_event;
};
@@ -396,7 +513,7 @@ ILibraryAppletCreator::ILibraryAppletCreator() : ServiceFramework("ILibraryApple
{0, &ILibraryAppletCreator::CreateLibraryApplet, "CreateLibraryApplet"},
{1, nullptr, "TerminateAllLibraryApplets"},
{2, nullptr, "AreAnyLibraryAppletsLeft"},
{10, nullptr, "CreateStorage"},
{10, &ILibraryAppletCreator::CreateStorage, "CreateStorage"},
{11, nullptr, "CreateTransferMemoryStorage"},
{12, nullptr, "CreateHandleStorage"},
};
@@ -412,72 +529,17 @@ void ILibraryAppletCreator::CreateLibraryApplet(Kernel::HLERequestContext& ctx)
NGLOG_DEBUG(Service_AM, "called");
}
class IStorageAccessor final : public ServiceFramework<IStorageAccessor> {
public:
explicit IStorageAccessor(std::vector<u8> buffer)
: ServiceFramework("IStorageAccessor"), buffer(std::move(buffer)) {
static const FunctionInfo functions[] = {
{0, &IStorageAccessor::GetSize, "GetSize"},
{10, nullptr, "Write"},
{11, &IStorageAccessor::Read, "Read"},
};
RegisterHandlers(functions);
}
void ILibraryAppletCreator::CreateStorage(Kernel::HLERequestContext& ctx) {
IPC::RequestParser rp{ctx};
const u64 size{rp.Pop<u64>()};
std::vector<u8> buffer(size);
private:
std::vector<u8> buffer;
IPC::ResponseBuilder rb{rp.MakeBuilder(2, 0, 1)};
rb.Push(RESULT_SUCCESS);
rb.PushIpcInterface<AM::IStorage>(std::move(buffer));
void GetSize(Kernel::HLERequestContext& ctx) {
IPC::ResponseBuilder rb{ctx, 4};
rb.Push(RESULT_SUCCESS);
rb.Push(static_cast<u64>(buffer.size()));
NGLOG_DEBUG(Service_AM, "called");
}
void Read(Kernel::HLERequestContext& ctx) {
IPC::RequestParser rp{ctx};
u64 offset = rp.Pop<u64>();
const size_t size{ctx.GetWriteBufferSize()};
ASSERT(offset + size <= buffer.size());
ctx.WriteBuffer(buffer.data() + offset, size);
IPC::ResponseBuilder rb{ctx, 2};
rb.Push(RESULT_SUCCESS);
NGLOG_DEBUG(Service_AM, "called");
}
};
class IStorage final : public ServiceFramework<IStorage> {
public:
explicit IStorage(std::vector<u8> buffer)
: ServiceFramework("IStorage"), buffer(std::move(buffer)) {
static const FunctionInfo functions[] = {
{0, &IStorage::Open, "Open"},
{1, nullptr, "OpenTransferStorage"},
};
RegisterHandlers(functions);
}
private:
std::vector<u8> buffer;
void Open(Kernel::HLERequestContext& ctx) {
IPC::ResponseBuilder rb{ctx, 2, 0, 1};
rb.Push(RESULT_SUCCESS);
rb.PushIpcInterface<AM::IStorageAccessor>(buffer);
NGLOG_DEBUG(Service_AM, "called");
}
};
NGLOG_DEBUG(Service_AM, "called, size={}", size);
}
IApplicationFunctions::IApplicationFunctions() : ServiceFramework("IApplicationFunctions") {
static const FunctionInfo functions[] = {

View File

@@ -121,6 +121,7 @@ public:
private:
void CreateLibraryApplet(Kernel::HLERequestContext& ctx);
void CreateStorage(Kernel::HLERequestContext& ctx);
};
class IApplicationFunctions final : public ServiceFramework<IApplicationFunctions> {

View File

@@ -156,16 +156,15 @@ void Maxwell3D::ProcessQueryGet() {
// TODO(Subv): Support the other query units.
ASSERT_MSG(regs.query.query_get.unit == Regs::QueryUnit::Crop,
"Units other than CROP are unimplemented");
ASSERT_MSG(regs.query.query_get.short_query,
"Writing the entire query result structure is unimplemented");
u32 value = Memory::Read32(*address);
u32 result = 0;
u64 result = 0;
// TODO(Subv): Support the other query variables
switch (regs.query.query_get.select) {
case Regs::QuerySelect::Zero:
result = 0;
// This seems to actually write the query sequence to the query address.
result = regs.query.query_sequence;
break;
default:
UNIMPLEMENTED_MSG("Unimplemented query select type {}",
@@ -174,15 +173,31 @@ void Maxwell3D::ProcessQueryGet() {
// TODO(Subv): Research and implement how query sync conditions work.
struct LongQueryResult {
u64_le value;
u64_le timestamp;
};
static_assert(sizeof(LongQueryResult) == 16, "LongQueryResult has wrong size");
switch (regs.query.query_get.mode) {
case Regs::QueryMode::Write:
case Regs::QueryMode::Write2: {
// Write the current query sequence to the sequence address.
u32 sequence = regs.query.query_sequence;
Memory::Write32(*address, sequence);
// TODO(Subv): Write the proper query response structure to the address when not using short
// mode.
if (regs.query.query_get.short_query) {
// Write the current query sequence to the sequence address.
// TODO(Subv): Find out what happens if you use a long query type but mark it as a short
// query.
Memory::Write32(*address, sequence);
} else {
// Write the 128-bit result structure in long mode. Note: We emulate an infinitely fast
// GPU, this command may actually take a while to complete in real hardware due to GPU
// wait queues.
LongQueryResult query_result{};
query_result.value = result;
// TODO(Subv): Generate a real GPU timestamp and write it here instead of 0
query_result.timestamp = 0;
Memory::WriteBlock(*address, &query_result, sizeof(query_result));
}
break;
}
default:

View File

@@ -354,10 +354,35 @@ public:
f32 scale_x;
f32 scale_y;
f32 scale_z;
u32 translate_x;
u32 translate_y;
u32 translate_z;
f32 translate_x;
f32 translate_y;
f32 translate_z;
INSERT_PADDING_WORDS(2);
MathUtil::Rectangle<s32> GetRect() const {
return {
GetX(), // left
GetY() + GetHeight(), // top
GetX() + GetWidth(), // right
GetY() // bottom
};
};
s32 GetX() const {
return static_cast<s32>(std::max(0.0f, translate_x - std::fabs(scale_x)));
}
s32 GetY() const {
return static_cast<s32>(std::max(0.0f, translate_y - std::fabs(scale_y)));
}
s32 GetWidth() const {
return static_cast<s32>(translate_x + std::fabs(scale_x)) - GetX();
}
s32 GetHeight() const {
return static_cast<s32>(translate_y + std::fabs(scale_y)) - GetY();
}
} viewport_transform[NumViewports];
struct {
@@ -371,15 +396,6 @@ public:
};
float depth_range_near;
float depth_range_far;
MathUtil::Rectangle<s32> GetRect() const {
return {
static_cast<s32>(x), // left
static_cast<s32>(y + height), // top
static_cast<s32>(x + width), // right
static_cast<s32>(y) // bottom
};
};
} viewport[NumViewports];
INSERT_PADDING_WORDS(0x1D);

View File

@@ -156,6 +156,13 @@ enum class PredOperation : u64 {
Xor = 2,
};
enum class LogicOperation : u64 {
And = 0,
Or = 1,
Xor = 2,
PassB = 3,
};
enum class SubOp : u64 {
Cos = 0x0,
Sin = 0x1,
@@ -202,6 +209,12 @@ union Instruction {
BitField<42, 1, u64> negate_pred;
} fmnmx;
union {
BitField<53, 2, LogicOperation> operation;
BitField<55, 1, u64> invert_a;
BitField<56, 1, u64> invert_b;
} lop;
float GetImm20_19() const {
float result{};
u32 imm{static_cast<u32>(imm20_19)};
@@ -217,8 +230,21 @@ union Instruction {
std::memcpy(&result, &imm, sizeof(imm));
return result;
}
s32 GetSignedImm20_20() const {
u32 immediate = static_cast<u32>(imm20_19 | (negate_imm << 19));
// Sign extend the 20-bit value.
u32 mask = 1U << (20 - 1);
return static_cast<s32>((immediate ^ mask) - mask);
}
} alu;
union {
BitField<39, 5, u64> shift_amount;
BitField<48, 1, u64> negate_b;
BitField<49, 1, u64> negate_a;
} iscadd;
union {
BitField<48, 1, u64> negate_b;
BitField<49, 1, u64> negate_c;
@@ -238,6 +264,16 @@ union Instruction {
BitField<56, 1, u64> neg_b;
} fsetp;
union {
BitField<0, 3, u64> pred0;
BitField<3, 3, u64> pred3;
BitField<39, 3, u64> pred39;
BitField<42, 1, u64> neg_pred;
BitField<45, 2, PredOperation> op;
BitField<48, 1, u64> is_signed;
BitField<49, 3, PredCondition> cond;
} isetp;
union {
BitField<39, 3, u64> pred39;
BitField<42, 1, u64> neg_pred;
@@ -245,9 +281,9 @@ union Instruction {
BitField<44, 1, u64> abs_b;
BitField<45, 2, PredOperation> op;
BitField<48, 4, PredCondition> cond;
BitField<52, 1, u64> bf;
BitField<53, 1, u64> neg_b;
BitField<54, 1, u64> abs_a;
BitField<52, 1, u64> bf;
BitField<55, 1, u64> ftz;
BitField<56, 1, u64> neg_imm;
} fset;
@@ -270,10 +306,37 @@ union Instruction {
} tex;
union {
// TODO(bunnei): This is just a guess, needs to be verified
BitField<52, 1, u64> enable_g_component;
BitField<50, 3, u64> component_mask_selector;
BitField<28, 8, Register> gpr28;
bool HasTwoDestinations() const {
return gpr28.Value() != Register::ZeroIndex;
}
bool IsComponentEnabled(size_t component) const {
static constexpr std::array<size_t, 5> one_dest_mask{0x1, 0x2, 0x4, 0x8, 0x3};
static constexpr std::array<size_t, 5> two_dest_mask{0x7, 0xb, 0xd, 0xe, 0xf};
const auto& mask{HasTwoDestinations() ? two_dest_mask : one_dest_mask};
ASSERT(component_mask_selector < mask.size());
return ((1 << component) & mask[component_mask_selector]) != 0;
}
} texs;
union {
BitField<20, 5, u64> target;
BitField<5, 1, u64> constant_buffer;
s32 GetBranchTarget() const {
// Sign extend the branch target offset
u32 mask = 1U << (5 - 1);
u32 value = static_cast<u32>(target);
// The branch offset is relative to the next instruction, so add 1 to it.
return static_cast<s32>((value ^ mask) - mask) + 1;
}
} bra;
BitField<61, 1, u64> is_b_imm;
BitField<60, 1, u64> is_b_gpr;
BitField<59, 1, u64> is_c_gpr;
@@ -292,6 +355,7 @@ class OpCode {
public:
enum class Id {
KIL,
BRA,
LD_A,
ST_A,
TEX,
@@ -311,6 +375,9 @@ public:
FMUL_R,
FMUL_IMM,
FMUL32_IMM,
ISCADD_C, // Scale and Add
ISCADD_R,
ISCADD_IMM,
MUFU, // Multi-Function Operator
RRO_C, // Range Reduction Operator
RRO_R,
@@ -332,6 +399,9 @@ public:
MOV_R,
MOV_IMM,
MOV32_IMM,
SHL_C,
SHL_R,
SHL_IMM,
SHR_C,
SHR_R,
SHR_IMM,
@@ -353,6 +423,9 @@ public:
enum class Type {
Trivial,
Arithmetic,
Logic,
Shift,
ScaledAdd,
Ffma,
Flow,
Memory,
@@ -456,6 +529,7 @@ private:
std::vector<Matcher> table = {
#define INST(bitstring, op, type, name) Detail::GetMatcher(bitstring, op, type, name)
INST("111000110011----", Id::KIL, Type::Flow, "KIL"),
INST("111000100100----", Id::BRA, Type::Flow, "BRA"),
INST("1110111111011---", Id::LD_A, Type::Memory, "LD_A"),
INST("1110111111110---", Id::ST_A, Type::Memory, "ST_A"),
INST("1100000000111---", Id::TEX, Type::Memory, "TEX"),
@@ -475,6 +549,9 @@ private:
INST("0101110001101---", Id::FMUL_R, Type::Arithmetic, "FMUL_R"),
INST("0011100-01101---", Id::FMUL_IMM, Type::Arithmetic, "FMUL_IMM"),
INST("00011110--------", Id::FMUL32_IMM, Type::Arithmetic, "FMUL32_IMM"),
INST("0100110000011---", Id::ISCADD_C, Type::ScaledAdd, "ISCADD_C"),
INST("0101110000011---", Id::ISCADD_R, Type::ScaledAdd, "ISCADD_R"),
INST("0011100-00011---", Id::ISCADD_IMM, Type::ScaledAdd, "ISCADD_IMM"),
INST("0101000010000---", Id::MUFU, Type::Arithmetic, "MUFU"),
INST("0100110010010---", Id::RRO_C, Type::Arithmetic, "RRO_C"),
INST("0101110010010---", Id::RRO_R, Type::Arithmetic, "RRO_R"),
@@ -485,17 +562,20 @@ private:
INST("0100110010110---", Id::F2I_C, Type::Arithmetic, "F2I_C"),
INST("0101110010110---", Id::F2I_R, Type::Arithmetic, "F2I_R"),
INST("0011100-10110---", Id::F2I_IMM, Type::Arithmetic, "F2I_IMM"),
INST("000001----------", Id::LOP32I, Type::Arithmetic, "LOP32I"),
INST("0100110010011---", Id::MOV_C, Type::Arithmetic, "MOV_C"),
INST("0101110010011---", Id::MOV_R, Type::Arithmetic, "MOV_R"),
INST("0011100-10011---", Id::MOV_IMM, Type::Arithmetic, "MOV_IMM"),
INST("000000010000----", Id::MOV32_IMM, Type::Arithmetic, "MOV32_IMM"),
INST("0100110000101---", Id::SHR_C, Type::Arithmetic, "SHR_C"),
INST("0101110000101---", Id::SHR_R, Type::Arithmetic, "SHR_R"),
INST("0011100-00101---", Id::SHR_IMM, Type::Arithmetic, "SHR_IMM"),
INST("0100110001100---", Id::FMNMX_C, Type::Arithmetic, "FMNMX_C"),
INST("0101110001100---", Id::FMNMX_R, Type::Arithmetic, "FMNMX_R"),
INST("0011100-01100---", Id::FMNMX_IMM, Type::Arithmetic, "FMNMX_IMM"),
INST("000001----------", Id::LOP32I, Type::Logic, "LOP32I"),
INST("0100110001001---", Id::SHL_C, Type::Shift, "SHL_C"),
INST("0101110001001---", Id::SHL_R, Type::Shift, "SHL_R"),
INST("0011100-01001---", Id::SHL_IMM, Type::Shift, "SHL_IMM"),
INST("0100110000101---", Id::SHR_C, Type::Shift, "SHR_C"),
INST("0101110000101---", Id::SHR_R, Type::Shift, "SHR_R"),
INST("0011100-00101---", Id::SHR_IMM, Type::Shift, "SHR_IMM"),
INST("0100110011100---", Id::I2I_C, Type::Conversion, "I2I_C"),
INST("0101110011100---", Id::I2I_R, Type::Conversion, "I2I_R"),
INST("01110001-1000---", Id::I2I_IMM, Type::Conversion, "I2I_IMM"),

View File

@@ -298,7 +298,7 @@ void RasterizerOpenGL::DrawArrays() {
const bool has_stencil = false;
const bool using_color_fb = true;
const bool using_depth_fb = false;
const MathUtil::Rectangle<s32> viewport_rect{regs.viewport[0].GetRect()};
const MathUtil::Rectangle<s32> viewport_rect{regs.viewport_transform[0].GetRect()};
const bool write_color_fb =
state.color_mask.red_enabled == GL_TRUE || state.color_mask.green_enabled == GL_TRUE ||
@@ -702,7 +702,7 @@ void RasterizerOpenGL::BindFramebufferSurfaces(const Surface& color_surface,
void RasterizerOpenGL::SyncViewport(const MathUtil::Rectangle<u32>& surfaces_rect, u16 res_scale) {
const auto& regs = Core::System().GetInstance().GPU().Maxwell3D().regs;
const MathUtil::Rectangle<s32> viewport_rect{regs.viewport[0].GetRect()};
const MathUtil::Rectangle<s32> viewport_rect{regs.viewport_transform[0].GetRect()};
state.viewport.x = static_cast<GLint>(surfaces_rect.left) + viewport_rect.left * res_scale;
state.viewport.y = static_cast<GLint>(surfaces_rect.bottom) + viewport_rect.bottom * res_scale;

View File

@@ -933,7 +933,8 @@ Surface RasterizerCacheOpenGL::GetSurface(const SurfaceParams& params, ScaleMatc
// Use GetSurfaceSubRect instead
ASSERT(params.width == params.stride);
ASSERT(!params.is_tiled || (params.width % 8 == 0 && params.height % 8 == 0));
ASSERT(!params.is_tiled ||
(params.GetActualWidth() % 8 == 0 && params.GetActualHeight() % 8 == 0));
// Check for an exact match in existing surfaces
Surface surface =

View File

@@ -88,6 +88,20 @@ private:
return *subroutines.insert(std::move(subroutine)).first;
}
/// Merges exit method of two parallel branches.
static ExitMethod ParallelExit(ExitMethod a, ExitMethod b) {
if (a == ExitMethod::Undetermined) {
return b;
}
if (b == ExitMethod::Undetermined) {
return a;
}
if (a == b) {
return a;
}
return ExitMethod::Conditional;
}
/// Scans a range of code for labels and determines the exit method.
ExitMethod Scan(u32 begin, u32 end, std::set<u32>& labels) {
auto [iter, inserted] =
@@ -97,11 +111,19 @@ private:
return exit_method;
for (u32 offset = begin; offset != end && offset != PROGRAM_END; ++offset) {
if (const auto opcode = OpCode::Decode({program_code[offset]})) {
const Instruction instr = {program_code[offset]};
if (const auto opcode = OpCode::Decode(instr)) {
switch (opcode->GetId()) {
case OpCode::Id::EXIT: {
return exit_method = ExitMethod::AlwaysEnd;
}
case OpCode::Id::BRA: {
u32 target = offset + instr.bra.GetBranchTarget();
labels.insert(target);
ExitMethod no_jmp = Scan(offset + 1, end, labels);
ExitMethod jmp = Scan(target, end, labels);
return exit_method = ParallelExit(no_jmp, jmp);
}
}
}
}
@@ -197,6 +219,11 @@ public:
return active_type == Type::Integer;
}
/// Returns the current active type of the register
Type GetActiveType() const {
return active_type;
}
/// Returns the index of the register
size_t GetIndex() const {
return index;
@@ -328,22 +355,28 @@ public:
shader.AddLine(dest + " = " + src + ';');
}
/// Generates code representing a uniform (C buffer) register.
std::string GetUniform(const Uniform& uniform, const Register& dest_reg) {
/// Generates code representing a uniform (C buffer) register, interpreted as the input type.
std::string GetUniform(const Uniform& uniform, GLSLRegister::Type type) {
declr_const_buffers[uniform.index].MarkAsUsed(static_cast<unsigned>(uniform.index),
static_cast<unsigned>(uniform.offset), stage);
std::string value =
'c' + std::to_string(uniform.index) + '[' + std::to_string(uniform.offset) + ']';
if (regs[dest_reg].IsFloat()) {
if (type == GLSLRegister::Type::Float) {
return value;
} else if (regs[dest_reg].IsInteger()) {
} else if (type == GLSLRegister::Type::Integer) {
return "floatBitsToInt(" + value + ')';
} else {
UNREACHABLE();
}
}
/// Generates code representing a uniform (C buffer) register, interpreted as the type of the
/// destination register.
std::string GetUniform(const Uniform& uniform, const Register& dest_reg) {
return GetUniform(uniform, regs[dest_reg].GetActiveType());
}
/// Add declarations for registers
void GenerateDeclarations() {
for (const auto& reg : regs) {
@@ -612,9 +645,9 @@ private:
std::string GetPredicateComparison(Tegra::Shader::PredCondition condition) const {
using Tegra::Shader::PredCondition;
static const std::unordered_map<PredCondition, const char*> PredicateComparisonStrings = {
{PredCondition::LessThan, "<"}, {PredCondition::Equal, "=="},
{PredCondition::LessEqual, "<="}, {PredCondition::GreaterThan, ">"},
{PredCondition::GreaterEqual, ">="},
{PredCondition::LessThan, "<"}, {PredCondition::Equal, "=="},
{PredCondition::LessEqual, "<="}, {PredCondition::GreaterThan, ">"},
{PredCondition::NotEqual, "!="}, {PredCondition::GreaterEqual, ">="},
};
auto comparison = PredicateComparisonStrings.find(condition);
@@ -808,6 +841,102 @@ private:
}
break;
}
case OpCode::Type::Logic: {
std::string op_a = regs.GetRegisterAsInteger(instr.gpr8, 0, false);
if (instr.alu.lop.invert_a)
op_a = "~(" + op_a + ')';
switch (opcode->GetId()) {
case OpCode::Id::LOP32I: {
u32 imm = static_cast<u32>(instr.alu.imm20_32.Value());
if (instr.alu.lop.invert_b)
imm = ~imm;
switch (instr.alu.lop.operation) {
case Tegra::Shader::LogicOperation::And: {
regs.SetRegisterToInteger(instr.gpr0, false, 0,
'(' + op_a + " & " + std::to_string(imm) + ')', 1, 1);
break;
}
case Tegra::Shader::LogicOperation::Or: {
regs.SetRegisterToInteger(instr.gpr0, false, 0,
'(' + op_a + " | " + std::to_string(imm) + ')', 1, 1);
break;
}
case Tegra::Shader::LogicOperation::Xor: {
regs.SetRegisterToInteger(instr.gpr0, false, 0,
'(' + op_a + " ^ " + std::to_string(imm) + ')', 1, 1);
break;
}
default:
NGLOG_CRITICAL(HW_GPU, "Unimplemented lop32i operation: {}",
static_cast<u32>(instr.alu.lop.operation.Value()));
UNREACHABLE();
}
break;
}
default: {
NGLOG_CRITICAL(HW_GPU, "Unhandled logic instruction: {}", opcode->GetName());
UNREACHABLE();
}
}
break;
}
case OpCode::Type::Shift: {
std::string op_a = regs.GetRegisterAsInteger(instr.gpr8, 0, false);
std::string op_b;
if (instr.is_b_imm) {
op_b += '(' + std::to_string(instr.alu.GetSignedImm20_20()) + ')';
} else {
if (instr.is_b_gpr) {
op_b += regs.GetRegisterAsInteger(instr.gpr20);
} else {
op_b += regs.GetUniform(instr.uniform, GLSLRegister::Type::Integer);
}
}
switch (opcode->GetId()) {
case OpCode::Id::SHL_C:
case OpCode::Id::SHL_R:
case OpCode::Id::SHL_IMM:
regs.SetRegisterToInteger(instr.gpr0, true, 0, op_a + " << " + op_b, 1, 1);
break;
default: {
NGLOG_CRITICAL(HW_GPU, "Unhandled shift instruction: {}", opcode->GetName());
UNREACHABLE();
}
}
break;
}
case OpCode::Type::ScaledAdd: {
std::string op_a = regs.GetRegisterAsInteger(instr.gpr8);
if (instr.iscadd.negate_a)
op_a = '-' + op_a;
std::string op_b = instr.iscadd.negate_b ? "-" : "";
if (instr.is_b_imm) {
op_b += '(' + std::to_string(instr.alu.GetSignedImm20_20()) + ')';
} else {
if (instr.is_b_gpr) {
op_b += regs.GetRegisterAsInteger(instr.gpr20);
} else {
op_b += regs.GetUniform(instr.uniform, GLSLRegister::Type::Integer);
}
}
std::string shift = std::to_string(instr.iscadd.shift_amount.Value());
regs.SetRegisterToInteger(instr.gpr0, true, 0,
"((" + op_a + " << " + shift + ") + " + op_b + ')', 1, 1);
break;
}
case OpCode::Type::Ffma: {
std::string op_a = regs.GetRegisterAsFloat(instr.gpr8);
std::string op_b = instr.ffma.negate_b ? "-" : "";
@@ -849,8 +978,7 @@ private:
ASSERT_MSG(!instr.conversion.saturate_a, "Unimplemented");
switch (opcode->GetId()) {
case OpCode::Id::I2I_R:
case OpCode::Id::I2F_R: {
case OpCode::Id::I2I_R: {
ASSERT_MSG(!instr.conversion.selector, "Unimplemented");
std::string op_a =
@@ -863,6 +991,17 @@ private:
regs.SetRegisterToInteger(instr.gpr0, instr.conversion.is_signed, 0, op_a, 1, 1);
break;
}
case OpCode::Id::I2F_R: {
std::string op_a =
regs.GetRegisterAsInteger(instr.gpr20, 0, instr.conversion.is_signed);
if (instr.conversion.abs_a) {
op_a = "abs(" + op_a + ')';
}
regs.SetRegisterToFloat(instr.gpr0, 0, op_a, 1, 1);
break;
}
case OpCode::Id::F2F_R: {
std::string op_a = regs.GetRegisterAsFloat(instr.gpr20);
@@ -938,18 +1077,21 @@ private:
// TEXS has two destination registers. RG goes into gpr0+0 and gpr0+1, and BA goes
// into gpr28+0 and gpr28+1
size_t offset{};
for (const auto& dest : {instr.gpr0.Value(), instr.gpr28.Value()}) {
for (unsigned elem = 0; elem < 2; ++elem) {
if (dest + elem >= Register::ZeroIndex) {
// Skip invalid register values
break;
if (!instr.texs.IsComponentEnabled(elem)) {
// Skip disabled components
continue;
}
regs.SetRegisterToFloat(dest, elem + offset, texture, 1, 4, false, elem);
if (!instr.texs.enable_g_component) {
// Skip the second component
break;
}
}
if (!instr.texs.HasTwoDestinations()) {
// Skip the second destination
break;
}
offset += 2;
}
--shader.scope;
@@ -983,7 +1125,7 @@ private:
if (instr.is_b_gpr) {
op_b += regs.GetRegisterAsFloat(instr.gpr20);
} else {
op_b += regs.GetUniform(instr.uniform, instr.gpr0);
op_b += regs.GetUniform(instr.uniform, GLSLRegister::Type::Float);
}
}
@@ -1014,6 +1156,42 @@ private:
}
break;
}
case OpCode::Type::IntegerSetPredicate: {
std::string op_a = regs.GetRegisterAsInteger(instr.gpr8, 0, instr.isetp.is_signed);
std::string op_b{};
ASSERT_MSG(!instr.is_b_imm, "ISETP_IMM not implemented");
if (instr.is_b_gpr) {
op_b += regs.GetRegisterAsInteger(instr.gpr20, 0, instr.isetp.is_signed);
} else {
op_b += regs.GetUniform(instr.uniform, GLSLRegister::Type::Integer);
}
using Tegra::Shader::Pred;
// We can't use the constant predicate as destination.
ASSERT(instr.isetp.pred3 != static_cast<u64>(Pred::UnusedIndex));
std::string second_pred =
GetPredicateCondition(instr.isetp.pred39, instr.isetp.neg_pred != 0);
std::string comparator = GetPredicateComparison(instr.isetp.cond);
std::string combiner = GetPredicateCombiner(instr.isetp.op);
std::string predicate = '(' + op_a + ") " + comparator + " (" + op_b + ')';
// Set the primary predicate to the result of Predicate OP SecondPredicate
SetPredicate(instr.isetp.pred3,
'(' + predicate + ") " + combiner + " (" + second_pred + ')');
if (instr.isetp.pred0 != static_cast<u64>(Pred::UnusedIndex)) {
// Set the secondary predicate to the result of !Predicate OP SecondPredicate,
// if enabled
SetPredicate(instr.isetp.pred0,
"!(" + predicate + ") " + combiner + " (" + second_pred + ')');
}
break;
}
case OpCode::Type::FloatSet: {
std::string op_a = instr.fset.neg_a ? "-" : "";
op_a += regs.GetRegisterAsFloat(instr.gpr8);
@@ -1034,7 +1212,7 @@ private:
if (instr.is_b_gpr) {
op_b += regs.GetRegisterAsFloat(instr.gpr20);
} else {
op_b += regs.GetUniform(instr.uniform, instr.gpr0);
op_b += regs.GetUniform(instr.uniform, GLSLRegister::Type::Float);
}
}
@@ -1053,7 +1231,12 @@ private:
std::string predicate = "(((" + op_a + ") " + comparator + " (" + op_b + ")) " +
combiner + " (" + second_pred + "))";
regs.SetRegisterToFloat(instr.gpr0, 0, predicate + " ? 1.0 : 0.0", 1, 1);
if (instr.fset.bf) {
regs.SetRegisterToFloat(instr.gpr0, 0, predicate + " ? 1.0 : 0.0", 1, 1);
} else {
regs.SetRegisterToInteger(instr.gpr0, false, 0, predicate + " ? 0xFFFFFFFF : 0", 1,
1);
}
break;
}
default: {
@@ -1078,6 +1261,13 @@ private:
shader.AddLine("discard;");
break;
}
case OpCode::Id::BRA: {
ASSERT_MSG(instr.bra.constant_buffer == 0,
"BRA with constant buffers are not implemented");
u32 target = offset + instr.bra.GetBranchTarget();
shader.AddLine("{ jmp_to = " + std::to_string(target) + "u; break; }");
break;
}
case OpCode::Id::IPA: {
const auto& attribute = instr.attribute.fmt28;
regs.SetRegisterToInputAttibute(instr.gpr0, attribute.element, attribute.index);

View File

@@ -335,6 +335,24 @@ void GMainWindow::OnDisplayTitleBars(bool show) {
}
}
bool GMainWindow::SupportsRequiredGLExtensions() {
QStringList unsupported_ext;
if (!GLAD_GL_ARB_program_interface_query)
unsupported_ext.append("ARB_program_interface_query");
if (!GLAD_GL_ARB_separate_shader_objects)
unsupported_ext.append("ARB_separate_shader_objects");
if (!GLAD_GL_ARB_shader_storage_buffer_object)
unsupported_ext.append("ARB_shader_storage_buffer_object");
if (!GLAD_GL_ARB_vertex_attrib_binding)
unsupported_ext.append("ARB_vertex_attrib_binding");
for (const QString& ext : unsupported_ext)
NGLOG_CRITICAL(Frontend, "Unsupported GL extension: {}", ext.toStdString());
return unsupported_ext.empty();
}
bool GMainWindow::LoadROM(const QString& filename) {
// Shutdown previous session if the emu thread is still active...
if (emu_thread != nullptr)
@@ -350,6 +368,14 @@ bool GMainWindow::LoadROM(const QString& filename) {
return false;
}
if (!SupportsRequiredGLExtensions()) {
QMessageBox::critical(
this, tr("Error while initializing OpenGL Core!"),
tr("Your GPU may not support one or more required OpenGL extensions. Please "
"ensure you have the latest graphics driver. See the log for more details."));
return false;
}
Core::System& system{Core::System::GetInstance()};
system.SetGPUDebugContext(debug_context);

View File

@@ -79,6 +79,7 @@ private:
void ConnectWidgetEvents();
void ConnectMenuEvents();
bool SupportsRequiredGLExtensions();
bool LoadROM(const QString& filename);
void BootGame(const QString& filename);
void ShutdownGame();

View File

@@ -78,6 +78,24 @@ void EmuWindow_SDL2::Fullscreen() {
SDL_MaximizeWindow(render_window);
}
bool EmuWindow_SDL2::SupportsRequiredGLExtensions() {
std::vector<std::string> unsupported_ext;
if (!GLAD_GL_ARB_program_interface_query)
unsupported_ext.push_back("ARB_program_interface_query");
if (!GLAD_GL_ARB_separate_shader_objects)
unsupported_ext.push_back("ARB_separate_shader_objects");
if (!GLAD_GL_ARB_shader_storage_buffer_object)
unsupported_ext.push_back("ARB_shader_storage_buffer_object");
if (!GLAD_GL_ARB_vertex_attrib_binding)
unsupported_ext.push_back("ARB_vertex_attrib_binding");
for (const std::string& ext : unsupported_ext)
NGLOG_CRITICAL(Frontend, "Unsupported GL extension: {}", ext);
return unsupported_ext.empty();
}
EmuWindow_SDL2::EmuWindow_SDL2(bool fullscreen) {
InputCommon::Init();
@@ -128,6 +146,11 @@ EmuWindow_SDL2::EmuWindow_SDL2(bool fullscreen) {
exit(1);
}
if (!SupportsRequiredGLExtensions()) {
NGLOG_CRITICAL(Frontend, "GPU does not support all required OpenGL extensions! Exiting...");
exit(1);
}
OnResize();
OnMinimalClientAreaChangeRequest(GetActiveConfig().min_client_area_size);
SDL_PumpEvents();

View File

@@ -46,6 +46,9 @@ private:
/// Called when user passes the fullscreen parameter flag
void Fullscreen();
/// Whether the GPU and driver supports the OpenGL extension required
bool SupportsRequiredGLExtensions();
/// Called when a configuration change affects the minimal size of the window
void OnMinimalClientAreaChangeRequest(
const std::pair<unsigned, unsigned>& minimal_size) override;