Commit graph

5197 commits

Author SHA1 Message Date
Wunkolo
4569f39c7c common: Replace common_sizes with user-literals
Removes common_sizes.h in favor of having `_KiB`, `_MiB`, `_GiB`, etc
user-literals within literals.h.

To keep the global namespace clean, users will have to use:

```
using namespace Common::Literals;
```

to access these literals.
2021-06-24 09:27:40 -07:00
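As an illustration of the literals described in this commit, here is a minimal sketch. The operator definitions are assumptions (the contents of literals.h are not shown in this log), and `ExampleBufferSize` is a hypothetical helper that exists only to show the opt-in `using namespace`.

```cpp
#include <cstdint>

namespace Common {
namespace Literals {

// Assumed definitions: each literal scales its operand by a power of 1024.
constexpr std::uint64_t operator""_KiB(unsigned long long n) {
    return static_cast<std::uint64_t>(n) * 1024ULL;
}
constexpr std::uint64_t operator""_MiB(unsigned long long n) {
    return static_cast<std::uint64_t>(n) * 1024ULL * 1024ULL;
}
constexpr std::uint64_t operator""_GiB(unsigned long long n) {
    return static_cast<std::uint64_t>(n) * 1024ULL * 1024ULL * 1024ULL;
}

} // namespace Literals
} // namespace Common

// Hypothetical caller: the literals are only visible after opting in locally,
// which keeps the global namespace clean.
inline std::uint64_t ExampleBufferSize() {
    using namespace Common::Literals;
    static_assert(1_KiB == 1024, "1 KiB is 1024 bytes");
    return 4_MiB;
}
```

Placing the `using namespace` inside the function body limits the literals to that scope rather than the whole translation unit.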
bunnei
1b09d6628b
Merge pull request #6517 from lioncash/fmtlib
externals: Update fmt to 8.0.0
2021-06-23 15:31:04 -07:00
Lioncash
d0b1f2bd05 General: Resolve fmt specifiers to adhere to 8.0.0 API where applicable
Also removes some deprecated API usages.
2021-06-23 13:48:21 -04:00
bunnei
d8d9bb0dfb
Merge pull request #6518 from lioncash/func
maxwell3d: Add missing return in default SizeInBytes() case
2021-06-23 09:43:00 -07:00
Lioncash
be6844c1ed maxwell3d: Add missing return in default SizeInBytes() case
We were returning '1' in ComponentCount()'s default case but were
neglecting to do the same with SizeInBytes().
2021-06-23 11:50:40 -04:00
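The shape of this fix can be sketched as follows. The enum and its values are hypothetical stand-ins rather than the real maxwell3d definitions; the only detail taken from the commit message is that both functions return 1 in their default case.

```cpp
#include <cstdint>

// Hypothetical stand-in for the attribute size enum; not the real definition.
enum class VertexSizeSketch : std::uint32_t {
    Size_8 = 0,
    Size_16_16 = 1,
    Size_32_32_32_32 = 2,
};

inline std::uint32_t ComponentCount(VertexSizeSketch size) {
    switch (size) {
    case VertexSizeSketch::Size_8:
        return 1;
    case VertexSizeSketch::Size_16_16:
        return 2;
    case VertexSizeSketch::Size_32_32_32_32:
        return 4;
    default:
        return 1; // already present before the fix
    }
}

inline std::uint32_t SizeInBytes(VertexSizeSketch size) {
    switch (size) {
    case VertexSizeSketch::Size_8:
        return 1;
    case VertexSizeSketch::Size_16_16:
        return 4;
    case VertexSizeSketch::Size_32_32_32_32:
        return 16;
    default:
        return 1; // the return that was missing; without it, control falls off the end
    }
}
```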
Mai M
17fff10e06
Merge pull request #6465 from FernandoS27/sex-on-the-beach
GPU: Implement a garbage collector for GPU Caches (project Reaper+)
2021-06-23 08:03:01 -04:00
Mai M
20f474b09a
Merge pull request #6508 from ReinUsesLisp/bootmanager-stop-token
bootmanager: Use std::stop_source for stopping emulation
2021-06-23 02:35:42 -04:00
Fernando Sahmkow
f9b940a442 Reaper: Set minimum cleaning limit on OGL. 2021-06-22 22:07:17 +02:00
Morph
81b1b71993 common: fs: Remove [[nodiscard]] attribute on Remove* functions
There are a lot of scenarios where we don't particularly care whether or not the removal operation succeeds and simply attempt a removal.

As such, removing the [[nodiscard]] attribute is best for these functions.
2021-06-22 13:36:24 -04:00
ReinUsesLisp
4009ae1da2 bootmanager: Use std::stop_source for stopping emulation
Use its std::stop_token to abort shader cache loading.

Using std::stop_token instead of std::atomic_bool allows the usage of
other utilities like std::stop_callback.
2021-06-22 00:04:57 -03:00
ReinUsesLisp
cf116a28a6 vk_master_semaphore: Use jthread for debug thread 2021-06-21 19:56:07 -03:00
lat9nq
a01459df3d gl_device: Expand on Mesa driver names
Makes this list a bit more capable at identifying Mesa drivers. Tries to
deal with two of the overloaded vendor strings in a more generic
fashion.
2021-06-20 23:04:07 -04:00
ameerj
fb16cbb17e video_core: Add GPU vendor name to window title bar 2021-06-20 23:04:07 -04:00
Fernando Sahmkow
569a1962c0 Reaper: Guarantee correct deletion. 2021-06-20 19:11:41 +02:00
ameerj
851c76233d util_shaders: Specify ASTC decoder memory barrier bits 2021-06-19 11:16:25 -04:00
ameerj
ace20ba4a4 astc_decoder.comp: Remove unnecessary LUT SSBOs
We can move them to instead be compile time constants within the shader.
2021-06-19 10:56:13 -04:00
ameerj
31b125ef57 astc: Various robustness enhancements for the gpu decoder
These changes should help in reducing crashes/driver panics that may
occur due to synchronization issues between the shader completion and
later access of the decoded texture.
2021-06-19 09:00:33 -04:00
ameerj
0b172d12c0 vulkan_debug_callback: Skip logging known false-positive validation errors
Avoids overwhelming the log with validation errors that are not applicable
2021-06-17 22:16:32 -04:00
Fernando Sahmkow
719a6dd5a1 Reaper: Correct size calculation on Vulkan. 2021-06-17 08:48:41 +02:00
Ameer J
c5b517aa5f
Merge pull request #6469 from ReinUsesLisp/blit-view-compat
texture_cache/util: Avoid relaxed image views on different bytes per block
2021-06-16 21:08:07 -04:00
Fernando Sahmkow
ca6f47c686 Reaper: Change memory restrictions on TC depending on host memory on VK. 2021-06-17 00:29:48 +02:00
Fernando Sahmkow
0dd98842bf Reaper: Address Feedback. 2021-06-16 21:35:03 +02:00
Fernando Sahmkow
954ad2a61e Reaper: Setup settings and final tuning. 2021-06-16 21:35:03 +02:00
Fernando Sahmkow
d8ad6aa187 Reaper: Tune it up to be a smart GC. 2021-06-16 21:35:02 +02:00
ReinUsesLisp
a11bc4a382 Initial Reaper Setup
WIP
2021-06-16 21:35:02 +02:00
ReinUsesLisp
5b1efe522e vulkan_memory_allocator: Release allocations with no commits 2021-06-16 21:35:01 +02:00
ameerj
5fc8393125 astc_decoder: Fix LDR CEM1 endpoint calculation
Per the spec, L1 is clamped to the value 0xff if it is greater than 0xff. An oversight caused us to take the maximum of L1 and 0xff, rather than the minimum.

Huge thanks to wwylele for finding this.

Co-Authored-By: Weiyi Wang <wwylele@gmail.com>
2021-06-15 20:19:01 -04:00
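The min/max mix-up described in this commit can be illustrated with a short sketch. The function names are hypothetical and the real decoder code is not shown here; only the clamping rule (values above 0xff become 0xff) comes from the message.

```cpp
#include <algorithm>
#include <cstdint>

// Correct clamp: values above 0xff are limited to 0xff, smaller values pass through.
inline std::uint32_t ClampL1(std::uint32_t l1) {
    return std::min<std::uint32_t>(l1, 0xFFu);
}

// The kind of oversight described: taking the maximum forces every
// value up to at least 0xff, so small endpoints are lost.
inline std::uint32_t BuggyClampL1(std::uint32_t l1) {
    return std::max<std::uint32_t>(l1, 0xFFu);
}
```

With an input of 0x40, `ClampL1` leaves the value unchanged while `BuggyClampL1` yields 0xff.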
ameerj
b2955479e5 configure_graphics: Add Accelerate ASTC decoding setting 2021-06-15 20:19:00 -04:00
ameerj
c4ff7ecf51 textures: Reintroduce CPU ASTC decoder
Users may want to fall back to the CPU ASTC texture decoder due to hangs
and crashes that may be caused by keeping the GPU under compute heavy
loads for extended periods of time. This is especially the case in games
such as Astral Chain which make extensive use of ASTC textures.
2021-06-15 20:19:00 -04:00
ReinUsesLisp
3d89398b84 texture_cache/util: Avoid relaxed image views on different bytes per pixel
Avoids API usage errors on UE4 titles leading to crashes.
2021-06-14 21:03:57 -03:00
lat9nq
932c0184a7 cmake: Fix find_program usage for 3.15
yuzu requires CMake 3.15 yet find_program was using REQUIRED, which is
only available on 3.18 and later. Instead, we check for
"<VAR>-NOTFOUND".

In addition, check for additional requirements before building libusb or
FFmpeg with autotools. Otherwise, CMake configuration will pass yet
compilation will fail.
2021-06-13 01:15:54 -04:00
Fernando Sahmkow
588ab44470 GPUThread: Remove async reads from Normal Accuracy. 2021-06-11 17:27:17 +02:00
ReinUsesLisp
7b0d8bd1fb rasterizer: Update pages in batches 2021-06-11 17:27:17 +02:00
Markus Wick
6755025310 Fix GCC undefined behavior sanitizer.
* Wrong alignment in u64 LOG_DEBUG -> memcpy.
* Huge shift exponent in stride calculation for linear buffer, unused result -> skipped.
* Large shift in buffer cache if word = 0, skip checking for set bits.

None of those were critical, so this should not change any behavior.
At least under the assumption that the last one used masking behavior, which always yields continuous_bits = 0.
2021-06-10 21:07:27 +02:00
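The first item in this commit (replacing a misaligned u64 access with memcpy) follows a common pattern, sketched below. `ReadU64Unaligned` is a hypothetical helper, not a function from the codebase.

```cpp
#include <cstdint>
#include <cstring>

// Reading a u64 through a cast pointer is undefined behavior when the address
// is not suitably aligned. Copying the bytes with memcpy is well defined for
// any alignment, and compilers typically lower it to a single load.
inline std::uint64_t ReadU64Unaligned(const unsigned char* src) {
    std::uint64_t value;
    std::memcpy(&value, src, sizeof(value));
    return value;
}
```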
bunnei
df91c9f5e6
Merge pull request #6410 from lat9nq/avoid-oob
decoders: Avoid out-of-bounds access
2021-06-07 10:51:17 -07:00
lat9nq
287a0f72a5 decoders: Break instead of continue
continue causes a memory leak in A Hat in Time.
2021-06-04 05:12:14 -04:00
lat9nq
1feefabeba decoders: Avoid out-of-bounds access
This is not a real fix, so assert here and continue before crashing.
2021-06-04 05:03:54 -04:00
ameerj
859ba21f6d buffer_cache: Simplify uniform disabling logic 2021-06-01 13:26:58 -04:00
bunnei
0a6f685ad0
Merge pull request #6367 from ReinUsesLisp/vma-host
vulkan_memory_allocator: Allow textures to be allocated in host memory
2021-05-31 23:35:11 -07:00
bunnei
8592f8a2b4 video_core: gpu: WaitFence: Do not block threads during shutdown.
- Fixes a hang on shutdown when NVFlinger thread is waiting on a syncpoint that will never occur.
- Commonly observed when stopping emulation in Super Mario Odyssey.
2021-05-29 01:06:04 -07:00
Markus Wick
5a8cd1b118 Fix two GCC 11 warnings: Unneeded copies.
std::move created an unneeded copy.
Iterating without a reference also created copies.
2021-05-29 08:57:44 +02:00
bunnei
4b95b0df97 video_core: rasterizer_cache: Use u16 for cached page count.
- Greatly reduces the risk of overflow, at the cost of doubling the size of this array.
2021-05-27 14:47:24 -07:00
ReinUsesLisp
19454e71d8 vulkan_memory_allocator: Allow textures to be allocated in host memory
Allow Vulkan's allocator to use host memory when there's no more device
local memory. This delays OOM, but it will eventually still happen.
2021-05-27 05:50:48 -03:00
Morph
065867e2c2
common: fs: Rework the Common Filesystem interface to make use of std::filesystem (#6270)
* common: fs: fs_types: Create filesystem types

Contains various filesystem types used by the Common::FS library

* common: fs: fs_util: Add std::string to std::u8string conversion utility

* common: fs: path_util: Add utility functions for paths

Contains various utility functions for getting or manipulating filesystem paths used by the Common::FS library

* common: fs: file: Rewrite the IOFile implementation

* common: fs: Reimplement Common::FS library using std::filesystem

* common: fs: fs_paths: Add fs_paths to replace common_paths

* common: fs: path_util: Add the rest of the path functions

* common: Remove the previous Common::FS implementation

* general: Remove unused fs includes

* string_util: Remove unused function and include

* nvidia_flags: Migrate to the new Common::FS library

* settings: Migrate to the new Common::FS library

* logging: backend: Migrate to the new Common::FS library

* core: Migrate to the new Common::FS library

* perf_stats: Migrate to the new Common::FS library

* reporter: Migrate to the new Common::FS library

* telemetry_session: Migrate to the new Common::FS library

* key_manager: Migrate to the new Common::FS library

* bis_factory: Migrate to the new Common::FS library

* registered_cache: Migrate to the new Common::FS library

* xts_archive: Migrate to the new Common::FS library

* service: acc: Migrate to the new Common::FS library

* applets/profile: Migrate to the new Common::FS library

* applets/web: Migrate to the new Common::FS library

* service: filesystem: Migrate to the new Common::FS library

* loader: Migrate to the new Common::FS library

* gl_shader_disk_cache: Migrate to the new Common::FS library

* nsight_aftermath_tracker: Migrate to the new Common::FS library

* vulkan_library: Migrate to the new Common::FS library

* configure_debug: Migrate to the new Common::FS library

* game_list_worker: Migrate to the new Common::FS library

* config: Migrate to the new Common::FS library

* configure_filesystem: Migrate to the new Common::FS library

* configure_per_game_addons: Migrate to the new Common::FS library

* configure_profile_manager: Migrate to the new Common::FS library

* configure_ui: Migrate to the new Common::FS library

* input_profiles: Migrate to the new Common::FS library

* yuzu_cmd: config: Migrate to the new Common::FS library

* yuzu_cmd: Migrate to the new Common::FS library

* vfs_real: Migrate to the new Common::FS library

* vfs: Migrate to the new Common::FS library

* vfs_libzip: Migrate to the new Common::FS library

* service: bcat: Migrate to the new Common::FS library

* yuzu: main: Migrate to the new Common::FS library

* vfs_real: Delete the contents of an existing file in CreateFile

Current usages of CreateFile expect to delete the contents of an existing file; retain this behavior for now.

* input_profiles: Don't iterate the input profile dir if it does not exist

Silences an error produced in the log if the directory does not exist.

* game_list_worker: Skip parsing file if the returned VfsFile is nullptr

Prevents crashes in GetLoader when the virtual file is nullptr

* common: fs: Validate paths for path length

* service: filesystem: Open the mod load directory as read only
2021-05-25 19:32:56 -04:00
bunnei
5068279f23
Merge pull request #6248 from A-w-x/intelmesa
gl_device: Intel: Disable texture view formats workaround on mesa
2021-05-20 23:47:14 -07:00
bunnei
7d86a6ff02
Merge pull request #6317 from ameerj/fps-fix
perf_stats: Rework FPS counter to be more accurate
2021-05-18 19:56:29 -07:00
bunnei
93bc59b62d
Merge pull request #6322 from ameerj/fast-null-buffer
buffer_cache: Ensure null buffers cannot take the fast uniform bind path
2021-05-17 15:45:36 -07:00
ameerj
acf22336ec buffer_cache: Ensure null buffers cannot take the fast uniform bind path
Fixes a crash in New Pokemon Snap
2021-05-16 07:43:40 -04:00
bunnei
a1138028a8
Merge pull request #6289 from ameerj/oob-blit
texture_cache: Handle out of bound texture blits
2021-05-15 21:32:37 -07:00
ameerj
5bef54618a perf_stats: Rework FPS counter to be more accurate
The FPS counter was based on metrics in the nvdisp swapbuffers call. This metric would be accurate if the gpu thread/renderer were synchronous with the nvdisp service, but that's no longer the case.

This commit moves the frame counting responsibility onto the concrete renderers after their frame draw calls, resulting in more meaningful metrics.
The displayed FPS is now made up of the average framerate between the previous and most recent update, in order to avoid distracting FPS counter updates when framerate is oscillating between close values.

The status bar update frequency was also changed from 2 seconds to 500ms.
2021-05-15 20:34:20 -04:00
ameerj
3671fd0a97 texture_cache: Handle out of bound texture blits
Some games interleave a texture blit using regions which are out-of-bounds. This addresses the interleaving to avoid oob reads from the src texture.
2021-05-07 22:14:21 -04:00
bunnei
2a7eff57a8 hle: kernel: Rename Process to KProcess. 2021-05-05 16:40:52 -07:00
A-w-x
6a2084a204 gl_device: Intel: Disable texture view formats workaround on mesa 2021-04-26 18:14:10 +02:00
bunnei
3c5fb53634
Merge pull request #6237 from ameerj/nvdec-end-fix
nvhost_vic: Fix device closure
2021-04-25 23:05:58 -07:00
ameerj
ae758a236f vk_texture_cache: Swap R and B channels of color flipped format
Swaps the Red and Blue channels of the A1B5G5R5_UNORM texture format, which was being incorrectly rendered.
2021-04-24 23:59:42 -04:00
ameerj
75e0d16caa nvhost_vic: Fix device closure
Implements the OnClose method of the nvhost_vic device, and removes the remnants of an older implementation.

Also cleans up some of the surrounding code.
2021-04-24 19:22:09 -04:00
Lioncash
17b7f0389a texture_cache/util: Fix src being used instead of dst within DeduceBlitImages
This line can only ever be reached if src is null, so dereferencing it
here is a logic bug that slipped through.

Instead, we dereference dst, which is guaranteed to be valid.
2021-04-19 13:01:50 -04:00
bunnei
9ad77ba6d3
Merge pull request #6125 from ogniK5377/nvdec-close-dev
nvdrv: Cleanup CDMA Processor on device closure
2021-04-16 23:14:44 -07:00
Chloe Marcec
edb1d5d242 Address issues 2021-04-16 13:52:32 +10:00
bunnei
de5bf640b7
Merge pull request #6196 from bunnei/asserts-setting
core: settings: Add setting for debug assertions and disable by default.
2021-04-14 17:47:18 -07:00
bunnei
a4c6712a4b common: Move settings to common from core.
- Removes a dependency on core and input_common from common.
2021-04-14 16:24:03 -07:00
bunnei
8146c8c5e7
Merge pull request #6191 from lioncash/vdtor
engine_interface: Add missing virtual destructor
2021-04-13 19:59:10 -07:00
bunnei
12a343ed8d
Merge pull request #6190 from lioncash/constfn2
vk_master_semaphore: Add missing const qualifier for IsFree()
2021-04-13 17:52:38 -07:00
bunnei
62b560e8e3
Merge pull request #6188 from lioncash/bits
vk_texture_cache: Make use of bit_cast where applicable
2021-04-13 16:44:49 -07:00
bunnei
154eb3cfbe
Merge pull request #6187 from lioncash/sign-conv
texture_cache/util: Resolve implicit sign conversions with std::reduce
2021-04-13 09:46:32 -07:00
Lioncash
31932904c5 engine_interface: Add missing virtual destructor
Eliminates a potential bug vector related to inheritance. Plus, we
should generally be specifying the destructor as virtual within purely
virtual interfaces to begin with.
2021-04-12 09:53:55 -04:00
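A short sketch of why the virtual destructor matters. The class names and the `CallMethod` signature are hypothetical stand-ins, not the real engine_interface declarations.

```cpp
#include <memory>

// Hypothetical interface: the virtual destructor ensures that deleting through
// a base pointer runs the derived destructor.
class EngineInterfaceSketch {
public:
    virtual ~EngineInterfaceSketch() = default;
    virtual void CallMethod(unsigned method, unsigned argument) = 0;
};

// Hypothetical implementation that records when its destructor runs.
class TrackingEngine final : public EngineInterfaceSketch {
public:
    explicit TrackingEngine(bool& destroyed_flag) : destroyed{destroyed_flag} {}
    ~TrackingEngine() override {
        destroyed = true;
    }
    void CallMethod(unsigned, unsigned) override {}

private:
    bool& destroyed;
};

// Owns the derived object only through the interface type, as engine users would.
inline bool DestroyThroughBase() {
    bool destroyed = false;
    {
        std::unique_ptr<EngineInterfaceSketch> engine =
            std::make_unique<TrackingEngine>(destroyed);
    }
    return destroyed;
}
```

Without the virtual destructor in the base class, destroying the object through the base pointer would be undefined behavior.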
Lioncash
9b331a5fb5 vk_master_semaphore: Deduplicate atomic access within IsFree()
We can just reuse the already existing KnownGpuTick() to deduplicate the
access.
2021-04-12 09:41:55 -04:00
Lioncash
c5f5d6e7f6 vk_master_semaphore: Add missing const qualifier for IsFree()
This member function doesn't modify class state.
2021-04-12 09:41:23 -04:00
Lioncash
4198c92ed0 vk_texture_cache: Make use of Common::BitCast where applicable
Also clarify the TODO comment a little more on the lacking
implementations for std::bit_cast.
2021-04-12 09:17:36 -04:00
Lioncash
fddb278aa3 texture_cache/util: Resolve implicit sign conversions with std::reduce
Amends implicit sign conversions occurring with usages of std::reduce
and also relocates it to its own utility function to reduce verbosity a
little bit.
2021-04-12 05:21:53 -04:00
Lioncash
4209588505 query_cache: Make use of std::erase_if
Same behavior, but much more straightforward to read.
2021-04-12 04:51:18 -04:00
Rodrigo Locatti
ddbd1387aa
Merge pull request #6181 from Joshua-Ashton/robustness_features
vulkan_device: Enable EXT_robustness2 features
2021-04-11 20:42:14 -03:00
Joshua Ashton
0ec6cb942d
vk_buffer_cache: Fix offset for NULL vertex buffers
The Vulkan spec states:
If an element of pBuffers is VK_NULL_HANDLE, then the corresponding element of pOffsets must be zero.

https://www.khronos.org/registry/vulkan/specs/1.2-extensions/man/html/vkCmdBindVertexBuffers2EXT.html#VUID-vkCmdBindVertexBuffers2EXT-pBuffers-04112
2021-04-11 10:34:52 +01:00
Joshua Ashton
08337a492d
vulkan_device: Enable EXT_robustness2 features
When this was being made mandatory, the enablement of these features was removed, but this is still needed.

Fixes: 757fd1e917 ("vulkan_device: Require VK_EXT_robustness2")
2021-04-11 09:48:38 +01:00
Joshua Ashton
bcf58c8210
renderer_vulkan: Check return value of AcquireNextImage
We can get into a really bad state by ignoring this
leading to device loss and using incorrect resources.
2021-04-11 09:27:50 +01:00
Markus Wick
e8bd9aed8b video_core: Use a CV for blocking commands.
There is no need for a busy loop here. Let's just use a condition variable to save some power.
2021-04-07 22:38:52 +02:00
Markus Wick
e6fb49fa4b video_core/gpu_thread: Keep the write lock for allocating the fence.
Else the fence might get submitted out-of-order into the queue, which makes testing them pointless.
Overhead should be tiny as the mutex is just moved from the queue to the writing code.
2021-04-07 22:38:52 +02:00
Markus Wick
5145133a60 video_core/gpu_thread: Implement a ShutDown method.
This was implicitly done by `is_powered_on = false`, however the explicit method allows us to block until the GPU is actually gone.

This should fix a race condition while removing the other subsystems while the GPU is still active.
2021-04-07 22:38:52 +02:00
Markus Wick
4aec060f6d common/threadsafe_queue: Provide Wait() method.
It shall block until there is something to consume in the queue.

And use it for the GPU emulation instead of the spin loop.
This is only in booting the emulator, however in BOTW this is the case for about 1 second.
2021-04-07 22:38:52 +02:00
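The idea of blocking on a condition variable instead of spinning can be sketched as below. This is a simplified mutex-based queue written for illustration, with hypothetical names; it is not the actual common/threadsafe_queue implementation.

```cpp
#include <condition_variable>
#include <mutex>
#include <queue>
#include <utility>

// Simplified sketch: consumers sleep in Wait() until a producer pushes,
// rather than burning CPU time in a busy loop.
template <typename T>
class BlockingQueueSketch {
public:
    void Push(T value) {
        {
            std::lock_guard<std::mutex> lock{mutex};
            items.push(std::move(value));
        }
        cv.notify_one(); // wake one sleeping consumer
    }

    // Blocks until there is something to consume in the queue.
    void Wait() {
        std::unique_lock<std::mutex> lock{mutex};
        cv.wait(lock, [this] { return !items.empty(); });
    }

    // Blocks until an element is available, then removes and returns it.
    T PopWait() {
        std::unique_lock<std::mutex> lock{mutex};
        cv.wait(lock, [this] { return !items.empty(); });
        T value = std::move(items.front());
        items.pop();
        return value;
    }

private:
    std::mutex mutex;
    std::condition_variable cv;
    std::queue<T> items;
};
```

The predicate passed to `cv.wait` guards against spurious wakeups, so `Wait()` returns only when the queue is non-empty.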
lat9nq
a60653dcd3 vp9: Avoid memcpy with null pointers
Avoid sending null pointer to memcpy as reported by Undefined Behaviour
Sanitizer. Replaces the std::memcpy calls in SpliceVectors with
std::copy calls. Opting to replace all the memcpy's with copy's.

Co-authored-by: LC <mathew1800@gmail.com>
2021-04-05 00:44:38 -04:00
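A minimal sketch of the memcpy-to-copy replacement. The signature is a simplified, hypothetical version of `SpliceVectors`; the real function in the vp9 decoder is not shown in this log.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// std::memcpy requires valid pointers even when the size is zero, and an empty
// vector's data() may be null. std::copy on iterator ranges has no such
// requirement: copying an empty range is simply a no-op.
inline std::vector<std::uint8_t> SpliceVectors(const std::vector<std::uint8_t>& first,
                                               const std::vector<std::uint8_t>& second) {
    std::vector<std::uint8_t> out(first.size() + second.size());
    auto next = std::copy(first.begin(), first.end(), out.begin());
    std::copy(second.begin(), second.end(), next);
    return out;
}
```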
Rodrigo Locatti
5ee669466f
Merge pull request #5927 from ameerj/astc-compute
video_core: Accelerate ASTC texture decoding using compute shaders
2021-03-30 19:31:52 -03:00
Chloe Marcec
bf1c1788ca nvdrv: Cleanup CDMA Processor on device closure
Brings us a step closer to unifying all channels to share a common interface.
2021-03-30 20:37:40 +11:00
Jan Beich
9b50b23a50 vulkan_common: enable OpenGL interop on other Unices 2021-03-30 00:25:25 +00:00
ameerj
2f83d9a61b astc_decoder: Refactor for style and more efficient memory use 2021-03-25 16:53:51 -04:00
Jan Beich
8c016b02e7 gl_device: unblock async shaders on other Unix systems
Mesa is the primary OpenGL provider on all FreeDesktop systems.
For example, iris is used on Intel GPU + FreeBSD by default.
2021-03-24 19:59:20 +00:00
lat9nq
538f097f97 gl_device: Block async shaders on AMD and Intel
Currently, the Windows versions of the Intel OpenGL driver and the AMD
proprietary OpenGL driver do not properly support asynchronous shader
compilation (and in fact degrade when it is enabled). This blocks
specifically those drivers from using this feature. This affects
AMDGPU-PRO on Linux, and AMD's and Intel's OpenGL drivers on Windows.
2021-03-21 01:25:45 -04:00
Rodrigo Locatti
2f30c10584 astc_decoder: Reimplement Layers
Reimplements the approach to decoding layers in the compute shader. Fixes multilayer astc decoding when using Vulkan.
2021-03-13 12:16:03 -05:00
ameerj
c7553abe89 astc_decoder: Fix out of bounds memory access
Resolves a crash with some anomalous textures found in Astral Chain.
2021-03-13 12:16:03 -05:00
ameerj
20eb368e14 renderer_vulkan: Accelerate ASTC decoding
Co-Authored-By: Rodrigo Locatti <reinuseslisp@airmail.cc>
2021-03-13 12:16:03 -05:00
ameerj
f6566338eb host_shaders: Modify shader cmake integration to allow for larger shaders
Using a raw string to encapsulate the entire shader code limits us to shaders of size less than 2KB. This change overcomes this limitation.
2021-03-13 12:16:03 -05:00
ameerj
2985e5e94c renderer_opengl: Accelerate ASTC texture decoding with a compute shader
ASTC texture decoding is currently handled by a CPU decoder for GPUs without native ASTC decoding support (most desktop GPUs). This is the cause for noticeable performance degradation in titles which use the format extensively.

This commit adds support to accelerate ASTC decoding using a compute shader on OpenGL for GPUs without native support.
2021-03-13 12:16:03 -05:00
bunnei
4735d18bb9
Merge pull request #6028 from bunnei/raster-cache
video_core: rasterizer_accelerated: Use a flat array instead of interval_map for cached pages.
2021-03-12 21:57:27 -08:00
bunnei
a9d24b0df3 video_core: rasterizer_accelerated: Fix un/signed mismatch. 2021-03-12 21:52:49 -08:00
Rodrigo Locatti
daf5c5060b
Merge pull request #5891 from ameerj/bgra-ogl
renderer_opengl: Use compute shaders to swizzle BGR textures on copy
2021-03-09 02:47:51 -03:00
bunnei
d1a7b2eca7
Merge pull request #6021 from ReinUsesLisp/skip-cache-heuristic
buffer_cache: Heuristically decide to skip cache on uniform buffers
2021-03-08 17:48:55 -08:00
ameerj
5213f70230 texture_cache: Blacklist BGRA8 copies and views on OpenGL
In order to force the BGRA8 conversion on Nvidia using OpenGL, we need to forbid texture copies and views with other formats.

This commit also adds a boolean relating to this, as this needs to be done only for the OpenGL API; Vulkan must remain unchanged.
2021-03-04 14:14:49 -05:00
ameerj
0639244d85 renderer_opengl: Swizzle BGR textures on copy
OpenGL does not natively support BGR internal formats, which causes many BGR textures to render incorrectly, with Red and Blue channels swapped.

This commit aims to address this by swizzling the blue and red channels on texture copies when a BGR format is encountered.
2021-03-04 14:14:19 -05:00
bunnei
b8b5891585
Merge pull request #5989 from ReinUsesLisp/cmdpool
vk_command_pool: Reduce the command pool size from 4096 to 4
2021-03-04 11:07:31 -08:00
bunnei
50ee9c46ab video_core: rasterizer_accelerated: Fix delta check ordering. 2021-03-02 17:48:02 -08:00
bunnei
6ab839462c video_core: rasterizer_accelerated: Improve error handling & fix implicit conversion. 2021-03-02 17:44:02 -08:00