stuff/suyu

mirror of https://git.suyu.dev/suyu/suyu.git synced 2024-11-25 04:46:27 -05:00

Author	SHA1	Message	Date
ReinUsesLisp	94915d4ea1	vk_graphics_pipeline: Set front facing properly Front face was being forced to a certain value when cull face is disabled. Set a default value on initialization and drop the forcefully set front facing value with culling disabled.	2020-01-18 18:50:47 -03:00
bunnei	9bf4850f74	Merge pull request #3305 from ReinUsesLisp/point-size-program gl_state: Implement PROGRAM_POINT_SIZE	2020-01-18 01:56:32 -05:00
bunnei	15163edaaa	Merge pull request #3312 from ReinUsesLisp/atoms-u32 shader/memory: Implement ATOMS.ADD.U32	2020-01-18 00:54:07 -05:00
ReinUsesLisp	09b1d762d7	vk_rasterizer: Address feedback	2020-01-17 21:40:01 -03:00
ReinUsesLisp	f34e519da3	gl_shader_decompiler: Fix decompilation of condition codes Use Visit instead of reimplementing it. Fixes unimplemented negations for condition codes.	2020-01-17 21:23:01 -03:00
bunnei	48863afb65	Merge pull request #3306 from ReinUsesLisp/gl-texture gl_texture_cache: Minor fixes and style changes	2020-01-17 15:44:02 -05:00
bunnei	657b3a366e	Merge pull request #3311 from ReinUsesLisp/z32fx24s8 format_lookup_table: Fix ZF32_X24S8 component types	2020-01-17 08:22:32 -05:00
ReinUsesLisp	fe5356d223	vk_rasterizer: Implement Vulkan's rasterizer This abstraction is Vulkan's equivalent to OpenGL's rasterizer. It takes care of joining all parts of the backend and rendering accordingly on demand.	2020-01-16 23:05:15 -03:00
ReinUsesLisp	38e789c761	renderer_vulkan: Add header as placeholder	2020-01-16 22:54:15 -03:00
bunnei	e041f33569	Merge pull request #3300 from ReinUsesLisp/vk-texture-cache vk_texture_cache: Implement generic texture cache on Vulkan	2020-01-16 19:19:26 -05:00
ReinUsesLisp	f09cd52980	vk_texture_cache: Address feedback	2020-01-16 18:23:10 -03:00
ReinUsesLisp	63ba41a26d	shader/memory: Implement ATOMS.ADD.U32	2020-01-16 17:30:55 -03:00
ReinUsesLisp	0caab54b5d	format_lookup_table: Fix ZF32_X24S8 component types Component types for ZF32_X24S8 were using UNORM. Drivers will set FLOAT, UINT, UNORM, UNORM; causing a format mismatch. This commit addresses that.	2020-01-16 17:29:13 -03:00
Rodrigo Locatti	82e1285c1e	vk_texture_cache: Fix typo in commentary Co-Authored-By: MysticExile <30736337+MysticExile@users.noreply.github.com>	2020-01-16 16:59:46 -03:00
bunnei	30faf6a964	Merge pull request #3308 from lioncash/private maxwell_3d: Make dirty_pointers private	2020-01-16 13:26:35 -05:00
bunnei	d23869811d	Merge pull request #3304 from lioncash/fwd-decl renderer_opengl/utils: Forward declare private structs	2020-01-16 11:21:18 -05:00
Lioncash	9e874898f5	maxwell_3d: Make dirty_pointers private This isn't used outside of the class itself, so we can make it private for the time being.	2020-01-16 04:07:15 -05:00
ReinUsesLisp	c375d735e6	gl_state: Implement PROGRAM_POINT_SIZE For gl_PointSize to have effect we have to activate GL_PROGRAM_POINT_SIZE.	2020-01-15 16:14:17 -03:00
Lioncash	7af56dfa76	renderer_opengl/utils: Remove unused header inclusions Nothing from these headers is used, so they can be removed.	2020-01-15 06:31:23 -05:00
Lioncash	06d30fbcca	renderer_opengl/utils: Forward declare private structs Keeps the definitions hidden and allows changes to the structs without needing to recompile all users of classes containing said structs.	2020-01-15 06:30:01 -05:00
ReinUsesLisp	66a1c777c9	gl_texture_cache: Use local variables to simplify DownloadTexture	2020-01-14 17:39:48 -03:00
ReinUsesLisp	cdb00546f0	gl_texture_cache: Fix format for RGBX16F	2020-01-14 17:38:33 -03:00
ReinUsesLisp	2d09467f6f	gl_texture_cache: Use Snorm internal format for RG8S	2020-01-14 17:37:58 -03:00
ReinUsesLisp	02624c35ec	gl_texture_cache: Use Snorm internal format for ABGR8S	2020-01-14 17:37:23 -03:00
Rodrigo Locatti	64cd46579b	Merge pull request #3303 from lioncash/reorder control_flow: Silence -Wreorder warning for CFGRebuildState	2020-01-14 16:15:18 -03:00
Lioncash	a1eee1749e	control_flow: Silence -Wreorder warning for CFGRebuildState Organizes the initializer list in the same order that the variables would actually be initialized in.	2020-01-14 13:28:48 -05:00
Lioncash	f10ea944e0	gl_shader_cache: Remove unused STAGE_RESERVED_UBOS constant Given this isn't used, this can be removed entirely.	2020-01-14 13:16:52 -05:00
Lioncash	4cd5ad90f3	gl_shader_cache: std::move entries in CachedShader constructor Avoids several reallocations of std::vector instances where applicable.	2020-01-14 13:14:16 -05:00
Lioncash	15a6840e7a	gl_shader_cache: Remove unused entries variable in BuildShader() Eliminates a few unnecessary constructions of std::vectors.	2020-01-14 13:11:49 -05:00
bunnei	55f95e7f26	Merge pull request #3287 from ReinUsesLisp/ldg-stg-16 shader_ir/memory: Implement u16 and u8 for STG and LDG	2020-01-14 09:57:08 -05:00
bunnei	15788ffcde	Merge pull request #3288 from ReinUsesLisp/uncurse-aoffi shader_ir/texture: Simplify AOFFI code	2020-01-13 23:52:12 -05:00
bunnei	6985eea519	Merge pull request #3290 from ReinUsesLisp/gl-clamp maxwell_to_vk: Implement GL_CLAMP hacking Nvidia's driver	2020-01-13 19:16:06 -05:00
ReinUsesLisp	09e17fbb0f	vk_texture_cache: Implement generic texture cache on Vulkan It currently ignores PBO linearizations since these should be dropped as soon as possible on OpenGL.	2020-01-13 20:37:50 -03:00
ReinUsesLisp	2b2712fa95	texture_cache/surface_params: Make GetNumLayers public	2020-01-13 20:35:43 -03:00
Rodrigo Locatti	b1138e5ea1	vk_compute_pass: Address feedback Comment hardcoded SPIR-V modules.	2020-01-10 22:46:34 -03:00
ReinUsesLisp	3d46709b7f	maxwell_to_vk: Implement GL_CLAMP hacking Nvidia's driver Nvidia's driver defaults invalid enumerations to GL_CLAMP. Vulkan doesn't expose GL_CLAMP through its API, but we can hack it on Nvidia's driver using the internal driver defaults.	2020-01-10 17:12:50 -03:00
ReinUsesLisp	13021b534c	shader_ir/texture: Simplify AOFFI code	2020-01-09 03:50:37 -03:00
ReinUsesLisp	e2a2a556b9	shader_ir/memory: Implement u16 and u8 for STG and LDG Using the same technique we used for u8 on LDG, implement u16. In the case of STG, load memory and insert the value we want to set into it with bitfieldInsert. Then set that value.	2020-01-09 02:12:29 -03:00
ReinUsesLisp	908e085d02	vk_compute_pass: Add compute passes to emulate missing Vulkan features This currently only supports quad arrays and u8 indices. In the future we can remove quad arrays with a table written from the CPU, but this was used to bootstrap the other passes helpers and it was left in the code. The blob code is generated from the "shaders/" directory. Read the instructions there to know how to generate the SPIR-V.	2020-01-08 19:24:26 -03:00
ReinUsesLisp	82a64da077	vk_shader_util: Add helper to build SPIR-V shaders	2020-01-08 19:22:20 -03:00
ReinUsesLisp	6888d776ff	vk_pipeline_cache: Initial implementation Given a pipeline key, this cache returns a pipeline abstraction (for graphics or compute).	2020-01-06 22:02:26 -03:00
ReinUsesLisp	2effdeb924	vk_graphics_pipeline: Initial implementation This abstraction represents the state of the 3D engine at a given draw. Instead of changing individual bits of the pipeline, as is done in APIs like D3D11, OpenGL and NVN, on Vulkan we are forced to put everything together into a single, immutable object. It takes advantage of the few dynamic states Vulkan offers.	2020-01-06 22:02:26 -03:00
ReinUsesLisp	dc96a59fa0	vk_compute_pipeline: Initial implementation This abstraction represents a Vulkan compute pipeline.	2020-01-06 22:02:26 -03:00
ReinUsesLisp	b392a5986e	vk_pipeline_cache: Add file and define descriptor update template filler This function allows us to share code between compute and graphics pipelines compilation.	2020-01-06 22:02:26 -03:00
ReinUsesLisp	3142f1b597	fixed_pipeline_state: Add depth clamp	2020-01-06 22:02:26 -03:00
ReinUsesLisp	9c548146ca	vk_rasterizer: Add placeholder	2020-01-06 22:02:26 -03:00
bunnei	5be00cba15	Merge pull request #3276 from ReinUsesLisp/pipeline-reqs vk_update_descriptor/vk_renderpass_cache: Add pipeline cache dependencies	2020-01-06 17:03:34 -05:00
ReinUsesLisp	5aeff9aff5	vk_renderpass_cache: Initial implementation The renderpass cache is used to avoid creating renderpasses on each draw. The hashed structure is not currently optimized.	2020-01-06 18:28:32 -03:00
ReinUsesLisp	322d6a0311	vk_update_descriptor: Initial implementation The update descriptor is used to store in flat memory a large chunk of staging data used to update descriptor sets through templates. It provides a push interface to easily insert descriptors following the current pipeline. The order used in the descriptor update template has to be implicitly followed. We can catch bugs here using validation layers.	2020-01-06 18:28:32 -03:00
ReinUsesLisp	5b01f80a12	vk_stream_buffer/vk_buffer_cache: Avoid halting and use generic cache Before this commit, once the stream buffer was full (no more bytes to write before looping) it waited for all previous operations to finish. This was a temporary solution and had a noticeable performance penalty (from what a profiler showed). To avoid this, mark usages of the stream buffer with fences and, once it loops, wait for them to be signaled. On average this will never wait. Each fence knows where its usage finishes, resulting in a non-paged stream buffer. On the other side, the buffer cache is reimplemented using the generic buffer cache. It makes use of the staging buffer pool and the new stream buffer.	2020-01-06 18:13:41 -03:00
ReinUsesLisp	ceb851b590	vk_memory_manager: Misc changes * Allocate memory in discrete exponentially increasing chunks until the 128 MiB threshold. Allocations larger than that increase linearly by 256 MiB (depending on the required size). This allows using small allocations for small resources. * Move memory maps to a RAII abstraction. To optimize for debugging tools (like RenderDoc) users will map/unmap on usage. If this ever becomes a noticeable overhead (from my profiling it doesn't) we can transparently move to persistent memory maps without harming the API, getting optimal performance for both gameplay and debugging. * Improve messages on exceptional situations. * Fix typos "requeriments" -> "requirements". * Small style changes.	2020-01-06 18:13:41 -03:00
ReinUsesLisp	85bb6a6f08	vk_buffer_cache: Temporarily remove buffer cache This is intended for a follow up commit to avoid circular dependencies.	2020-01-06 17:58:46 -03:00
bunnei	89fc75d769	Merge pull request #3257 from degasus/no_busy_loops video_core: Block in WaitFence.	2020-01-06 00:09:57 -05:00
Fernando Sahmkow	56e450a3f7	Merge pull request #3264 from ReinUsesLisp/vk-descriptor-pool vk_descriptor_pool: Initial implementation	2020-01-05 15:54:41 -04:00
bunnei	cd0a7dfdbc	Merge pull request #3258 from FernandoS27/shader-amend Shader_IR: add the ability to amend code in the shader ir.	2020-01-04 14:05:17 -05:00
Fernando Sahmkow	3dd6b55851	Shader_IR: Address Feedback	2020-01-04 14:40:57 -04:00
Fernando Sahmkow	a1667a7b46	Shader_IR: Implement TXD Array. This commit extends the compilation of TXD to support array samplers on TXD.	2020-01-04 13:28:02 -04:00
Rodrigo Locatti	6e347d8d1b	Update src/video_core/renderer_vulkan/vk_descriptor_pool.cpp Co-Authored-By: Mat M. <mathew1800@gmail.com>	2020-01-03 17:34:30 -03:00
ReinUsesLisp	0d6d8129c4	yuzu: Remove Maxwell debugger This was carried from Citra and wasn't really used on yuzu. It also adds some runtime overhead. This commit removes it from yuzu's codebase.	2020-01-02 23:09:44 -03:00
bunnei	ae0e481677	Merge pull request #3243 from ReinUsesLisp/topologies maxwell_to_gl: Implement missing primitive topologies	2020-01-01 20:33:33 -05:00
ReinUsesLisp	1fe7df4517	vk_descriptor_pool: Initial implementation Create a large descriptor pool where we allocate all our descriptors from. It has to be wide enough to support any pipeline, hence its large numbers. If the descriptor pool is filled, we allocate more memory at that moment. This way we can take advantage of permissive drivers like Nvidia's that allocate more descriptors than what the spec requires.	2020-01-01 16:44:06 -03:00
bunnei	028b2718ed	Merge pull request #3239 from ReinUsesLisp/p2r shader/p2r: Implement P2R Pr	2019-12-31 20:37:16 -05:00
Fernando Sahmkow	b3371ed09e	Shader_IR: add the ability to amend code in the shader ir. This commit introduces a mechanism by which shader IR code can be amended and extended. This is useful for track algorithms where certain information can be derived from before the track, such as indexes to array samplers.	2019-12-30 15:31:48 -04:00
Fernando Sahmkow	7bd447355f	Merge pull request #3248 from ReinUsesLisp/vk-image vk_image: Add an image object abstraction	2019-12-30 14:25:14 -04:00
Rodrigo Locatti	4cbb363d3f	vk_image: Avoid unnecessary equals	2019-12-30 13:28:23 -03:00
Fernando Sahmkow	287d5921cf	Merge pull request #3249 from ReinUsesLisp/vk-staging-buffer-pool vk_staging_buffer_pool: Add a staging pool for temporary operations	2019-12-30 12:25:59 -04:00
Markus Wick	cb9dd01ffd	video_core: Block in WaitFence. This function is called rarely and blocks quite often for a long time. So don't waste power and let the CPU sleep. This might also increase the performance as the other cores might be allowed to clock higher.	2019-12-30 13:04:53 +01:00
Rodrigo Locatti	f2c61bbe13	vk_staging_buffer_pool: Initialize last epoch to zero	2019-12-29 19:19:43 -03:00
Fernando Sahmkow	f846e3d6d0	Merge pull request #3250 from ReinUsesLisp/empty-fragment gl_rasterizer: Allow rendering without fragment shader	2019-12-28 14:33:53 -04:00
bunnei	8a76f816a4	Merge pull request #3228 from ReinUsesLisp/ptp shader/texture: Implement AOFFI and PTP for TLD4 and TLD4S	2019-12-26 21:43:44 -05:00
ReinUsesLisp	5b989f189f	gl_rasterizer: Allow rendering without fragment shader Rendering without a fragment shader is usually used in depth-only passes.	2019-12-26 16:38:49 -03:00
ReinUsesLisp	3813af2f3c	vk_staging_buffer_pool: Add a staging pool for temporary operations The job of this abstraction is to provide staging buffers for temporary operations. Think of image uploads or buffer uploads to device memory. It automatically deletes unused buffers.	2019-12-25 18:12:17 -03:00
ReinUsesLisp	c83bf7cd1e	vk_image: Add an image object abstraction This object's job is to contain an image and manage its transitions. Since Nvidia hardware doesn't know what a transition is but Vulkan requires them anyway, we have to state track image subresources individually. To avoid the overhead of tracking each subresource in images with many subresources (think of cubemap arrays with several mipmaps), this commit tracks when subresources have diverged. As long as this doesn't happen we can check the state of the first subresource (that will be shared with all subresources) and update accordingly. Image transitions are deferred to the scheduler command buffer.	2019-12-25 18:00:16 -03:00
Fernando Sahmkow	5619d24377	Merge pull request #3244 from ReinUsesLisp/vk-fps fixed_pipeline_state: Define structure and loaders	2019-12-25 14:31:29 -04:00
bunnei	4af569ee47	Merge pull request #3236 from ReinUsesLisp/rasterize-enable gl_rasterizer: Implement RASTERIZE_ENABLE	2019-12-24 22:54:10 -05:00
ReinUsesLisp	b9e3f5eb36	fixed_pipeline_state: Define symmetric operator!= and mark as noexcept Marks as noexcept Hash, operator== and operator!= for consistency.	2019-12-24 18:24:08 -03:00
ReinUsesLisp	4a3026b16b	fixed_pipeline_state: Define structure and loaders The intention behind this hashable structure is to describe the state of fixed function pipeline state that gets compiled to a single graphics pipeline state object. This is all dynamic state in OpenGL but Vulkan wants it in an immutable state, even if hardware can edit it freely. In this commit the structure is defined in an unoptimized state (it uses booleans, has paddings and many data entries that can be packed to single integers). This is intentional as an initial implementation that is easier to debug, implement and review. It will be optimized in later stages, or it might change if Vulkan gets more dynamic states.	2019-12-22 22:59:11 -03:00
ReinUsesLisp	5770418fb3	maxwell_3d: Add depth bounds registers	2019-12-22 22:55:06 -03:00
ReinUsesLisp	91d35559e5	maxwell_to_gl: Implement missing primitive topologies Many of these topologies are exclusively available in OpenGL.	2019-12-22 22:33:01 -03:00
bunnei	e976d0e924	Merge pull request #3241 from ReinUsesLisp/gl-shader-cache gl_shader_cache: Style changes	2019-12-22 16:23:46 -05:00
bunnei	1e76655f83	Merge pull request #3238 from ReinUsesLisp/vk-resource-manager vk_resource_manager: Catch device losses and other changes	2019-12-22 15:57:16 -05:00
bunnei	0f3ac9cfeb	Merge pull request #3203 from FernandoS27/tex-cache-fixes Texture Cache: Add HLE methods for building 3D textures	2019-12-22 14:25:13 -05:00
Fernando Sahmkow	3dc585d011	Merge pull request #3237 from ReinUsesLisp/vk-shader-decompiler vk_shader_decompiler: Misc changes	2019-12-22 12:36:56 -04:00
Fernando Sahmkow	218ee18417	Texture Cache: Improve documentation	2019-12-22 12:29:23 -04:00
Fernando Sahmkow	a3916588b6	Texture Cache: Address Feedback	2019-12-22 12:24:34 -04:00
Fernando Sahmkow	51c9e98677	Texture Cache: Add HLE methods for building 3D textures within the GPU in certain scenarios. This commit adds a series of HLE methods for handling 3D textures in general. This helps games that generate 3D textures on every frame and may reduce loading times for certain games.	2019-12-22 12:24:34 -04:00
Fernando Sahmkow	aea978e037	Merge pull request #3230 from ReinUsesLisp/vk-emu-shaders renderer_vulkan/shader: Add helper GLSL shaders	2019-12-22 11:23:09 -04:00
Fernando Sahmkow	27efcc15e9	Merge pull request #3240 from ReinUsesLisp/decomp-cond-code vk_shader_decompiler: Use Visit instead of reimplementing it	2019-12-22 11:20:55 -04:00
bunnei	16dcfacbfc	Merge pull request #3235 from ReinUsesLisp/ldg-u8 shader/memory: Implement LDG.U8 and unaligned U8 loads	2019-12-21 22:50:28 -05:00
ReinUsesLisp	1e16023d60	gl_shader_cache: Update commentary for shared memory Remove false commentary. Not dividing by 4 the size of shared memory is not a hack; it describes the number of integers, not bytes. While we are at it sort the generated code to put preprocessor lines on the top.	2019-12-20 22:51:21 -03:00
ReinUsesLisp	486c6a5316	gl_shader_cache: Remove unused entry in GetPrimitiveDescription	2019-12-20 22:49:30 -03:00
ReinUsesLisp	af93909c9c	vk_shader_decompiler: Use Visit instead of reimplementing it ExprCondCode visit implements the generic Visit. Use this instead of that one. As an intended side effect this fixes unwritten memory usages in cases when a negation of a condition code is used.	2019-12-20 21:36:25 -03:00
ReinUsesLisp	38d3a48873	shader/p2r: Implement P2R Pr P2R dumps predicate or condition codes state to a register. This is useful for unit testing.	2019-12-20 18:02:41 -03:00
ReinUsesLisp	cf27b59493	shader/r2p: Refactor R2P to support P2R	2019-12-20 17:55:42 -03:00
bunnei	7be65c6a68	Merge pull request #3234 from ReinUsesLisp/i2f-u8-selector shader/conversion: Implement byte selector in I2F	2019-12-19 22:36:26 -05:00
bunnei	6d55b14cc0	Merge pull request #3233 from ReinUsesLisp/mismatch-sizes shader/texture: Properly shrink unused entries in size mismatches	2019-12-19 20:40:27 -05:00
ReinUsesLisp	e41da22c8d	vk_resource_manager: Add entry to VKFence to test its usage	2019-12-19 16:31:34 -03:00
ReinUsesLisp	ec983a2451	vk_resource_manager: Add assert for releasing fences Notify the programmer when a request to release a fence is invalid because the fence is already free.	2019-12-19 16:31:34 -03:00
ReinUsesLisp	6ddffa010a	vk_resource_manager: Implement VKFenceWatch move constructor This allows us to put VKFenceWatch inside a std::vector without storing it in heap. On move we have to signal the fences where the new protected resource is, adding some overhead.	2019-12-19 16:31:34 -03:00
ReinUsesLisp	54747d60bc	vk_device: Add entry to catch device losses VK_NV_device_diagnostic_checkpoints allows us to push data to a Vulkan queue and then query it even after a device loss. This allows us to push the current pipeline object and see what was the call that killed the device.	2019-12-19 16:31:33 -03:00
ReinUsesLisp	2a63b3bdb9	vk_shader_decompiler: Fix full decompilation When full decompilation was enabled, labels were not being inserted and instructions were misused. Fix these bugs.	2019-12-19 16:24:45 -03:00
ReinUsesLisp	de918ebeb0	vk_shader_decompiler: Skip NDC correction when it is native Avoid changing gl_Position when the NDC used by the game is [0, 1] (Vulkan's native).	2019-12-19 16:24:45 -03:00
ReinUsesLisp	485c21eac3	vk_shader_decompiler: Normalize output fragment attachments Some games write from fragment shaders to a nonexistent framebuffer attachment or they don't write to one when it exists in the framebuffer. Fix this by skipping writes or adding zeroes.	2019-12-19 16:24:45 -03:00
bunnei	1eb4a95d2b	Merge pull request #3232 from ReinUsesLisp/gl-decompiler-images gl_shader_decompiler: Add missing DeclareImages	2019-12-19 11:32:47 -05:00
bunnei	253aa52351	Merge pull request #3231 from ReinUsesLisp/tld4s-encoding shader_bytecode: Fix TLD4S encoding	2019-12-19 11:32:25 -05:00
ReinUsesLisp	f4a25f854c	vk_device: Add query for RGBA8Uint	2019-12-19 02:08:29 -03:00
ReinUsesLisp	abb33d4aec	vk_shader_decompiler: Update sirit and implement Texture AOFFI	2019-12-19 01:42:13 -03:00
bunnei	d53cf05513	Merge pull request #3221 from ReinUsesLisp/vk-scheduler vk_scheduler: Delegate commands to a worker thread and state track	2019-12-18 22:04:08 -05:00
ReinUsesLisp	da0aa4da6b	gl_rasterizer: Implement RASTERIZE_ENABLE RASTERIZE_ENABLE is the opposite of GL_RASTERIZER_DISCARD. Implement it naturally using this. NVN games expect rasterize to be enabled by default, reflect that in our initial GPU state.	2019-12-18 19:28:23 -03:00
ReinUsesLisp	ae8d4b6c0c	shader/memory: Implement LDG.U8 and unaligned U8 loads LDG can load single bytes instead of full integers or packs of integers. These have the advantage of loading bytes that are not aligned to 4 bytes. To emulate these, this commit gets the byte being referenced (by doing "address & 3") and then uses that to extract the byte from the loaded integer: result = bitfieldExtract(loaded_integer, (address % 4) * 8, 8)	2019-12-18 01:21:46 -03:00
ReinUsesLisp	a7d6bd1ef1	shader/conversion: Implement byte selector in I2F I2F's byte selector is used to choose what bytes to convert to float. e.g. if the input is 0xaabbccdd and the selector is ".B3" it will convert 0xaa. The default (when it's not shown in nvdisasm) is ".B0", in that example the default would convert 0xdd to float.	2019-12-18 00:41:22 -03:00
ReinUsesLisp	15a753b9a5	shader/texture: Properly shrink unused entries in size mismatches When an image format mismatches we were inserting zeroes to the texture itself. This was not handling cases where the mismatch uses fewer coordinates than the guest shader code. Address that by resizing the vector.	2019-12-17 23:38:10 -03:00
ReinUsesLisp	e438079b50	gl_shader_decompiler: Add missing DeclareImages	2019-12-17 23:34:15 -03:00
ReinUsesLisp	8b26b4228b	shader_bytecode: Fix TLD4S encoding	2019-12-17 23:32:10 -03:00
ReinUsesLisp	b52297767e	renderer_vulkan/shader: Add helper GLSL shaders These shaders are used to specify code that is not dynamically generated in the Vulkan backend. Instead of packing it inside the build system, it's manually built and copied to the C++ file to avoid adding unnecessary build time dependencies. quad_array should be dropped in the future since it can be emulated with a memory pool generated from the CPU.	2019-12-16 17:59:08 -03:00
bunnei	65b1b05e05	Merge pull request #3182 from ReinUsesLisp/renderer-opengl renderer_opengl: Miscellaneous clean ups	2019-12-16 13:01:04 -05:00
ReinUsesLisp	e09c1fbc1f	shader/texture: Implement TLD4.PTP	2019-12-16 04:09:24 -03:00
ReinUsesLisp	844e4a297b	shader/texture: Enable arrayed TLD4	2019-12-16 02:37:21 -03:00
ReinUsesLisp	a87c85eba2	gl_shader_decompiler: Rename "sepparate" to "separate"	2019-12-16 02:12:51 -03:00
ReinUsesLisp	3d2c44848b	shader/texture: Implement AOFFI for TLD4S	2019-12-16 02:06:42 -03:00
ReinUsesLisp	3d9fff82c0	shader/texture: Remove unnecessary parentheses	2019-12-16 01:52:33 -03:00
Rodrigo Locatti	eac075692b	Merge pull request #3219 from FernandoS27/fix-bindless Corrections and fixes to TLD4S & bindless samplers failing	2019-12-16 01:26:11 -03:00
bunnei	3d51153611	Merge pull request #3222 from ReinUsesLisp/maxwell-to-vk maxwell_to_vk: Use VK_EXT_index_type_uint8 and misc changes	2019-12-14 22:30:12 -05:00
bunnei	035ec7d9de	Merge pull request #3213 from ReinUsesLisp/intel-mesa gl_device: Enable compute shaders for Intel Mesa drivers	2019-12-14 16:04:31 -05:00
bunnei	2b650543c6	Merge pull request #3212 from ReinUsesLisp/fix-smem-lmem gl_shader_cache: Add missing new-line on emitted GLSL	2019-12-13 21:35:29 -05:00
ReinUsesLisp	e3ea583893	maxwell_to_vk: Improve image format table and add more formats A1B5G5R5 uses A1R5G5B5. This is flipped with image view swizzles; flushing is still not properly implemented on Vulkan for this particular format.	2019-12-13 03:12:29 -03:00
ReinUsesLisp	f27b21077d	maxwell_to_vk: Implement more vertex formats	2019-12-13 03:12:28 -03:00
ReinUsesLisp	8db8631d81	maxwell_to_vk: Implement more primitive topologies Add an extra argument to query device capabilities in the future. The intention behind this is to use native quads, quad strips, line loops and polygons if these are released for Vulkan.	2019-12-13 03:12:28 -03:00
ReinUsesLisp	15513f0801	maxwell_to_vk: Approach GL_CLAMP closer to the GL spec The OpenGL spec defines GL_CLAMP's formula similarly to CLAMP_TO_EDGE and CLAMP_TO_BORDER depending on the filter mode used. It doesn't exactly behave like this, but it's the closest we can get with what Vulkan offers without emulating it by injecting shader code.	2019-12-13 03:12:28 -03:00
ReinUsesLisp	f845df8651	maxwell_to_vk: Use VK_EXT_index_type_uint8 when available	2019-12-13 02:37:23 -03:00
ReinUsesLisp	2df9a2dcaf	vk_scheduler: Delegate commands to a worker thread and state track Introduce a worker thread approach for delegating Vulkan work derived from dxvk's approach. https://github.com/doitsujin/dxvk Now that the scheduler is what handles all Vulkan work related to command streaming, store state tracking in itself. This way we can know when to reupload Vulkan dynamic state to the queue (since this one is invalidated between command buffers unlike NVN). We can also store the renderpass state and graphics pipeline bound to avoid redundant binds and renderpass begins/ends.	2019-12-13 02:24:48 -03:00
bunnei	8fc49a83b6	Merge pull request #3217 from jhol/fix-boost-include Added missing include	2019-12-11 22:21:24 -05:00
Fernando Sahmkow	c0ee0aa1a8	Shader_IR: Correct TLD4S Depth Compare.	2019-12-11 19:53:17 -04:00
Fernando Sahmkow	af89723fa3	Shader_Ir: Correct TLD4S encoding and implement f16 flag.	2019-12-11 19:53:17 -04:00
Fernando Sahmkow	84a158c977	Gl_Shader_compiler: Correct Depth Compare for Texture Gather operations.	2019-12-11 19:53:16 -04:00
Fernando Sahmkow	271a3264f3	Shader_Ir: default failed tracks on bindless samplers to null values.	2019-12-11 19:53:16 -04:00
Fernando Sahmkow	1d2ba3cc97	Gl_Rasterizer: Skip Tessellation Control and Eval stages as they are unimplemented. This commit ensures the OGL backend does not execute tessellation shader stages as they are currently unimplemented.	2019-12-11 15:41:26 -04:00
bunnei	1a66cde175	Merge pull request #3210 from ReinUsesLisp/memory-barrier shader: Implement MEMBAR.GL	2019-12-11 14:24:39 -05:00
Joel Holdsworth	e9faa1617c	Added missing include	2019-12-11 18:11:49 +00:00
ReinUsesLisp	f564eaebed	gl_device: Enable compute shaders for Intel Mesa drivers Previously we naively checked for "Intel" in GL_VENDOR, but this includes both Intel's proprietary driver and the mesa driver. Re-enable compute shaders for mesa.	2019-12-11 00:00:30 -03:00
ReinUsesLisp	48e16c4c49	gl_shader_cache: Add missing new-line on emitted GLSL Add missing new-line. This caused shaders using local memory and shared memory to inject a preprocessor GLSL line after an expression (resulting in invalid code). It looked like this: shared uint smem[8];#define LOCAL_MEMORY_SIZE 16 It should look like this (addressed by this commit): shared uint smem[8]; #define LOCAL_MEMORY_SIZE 16	2019-12-10 23:52:51 -03:00
Fernando Sahmkow	7ffb672f61	Maxwell3D: Implement Depth Mode. This commit finishes adding depth mode that was reverted before due to other unresolved issues.	2019-12-10 19:51:46 -04:00
ReinUsesLisp	425a254fa2	shader: Implement MEMBAR.GL Implement using memoryBarrier in GLSL and OpMemoryBarrier on SPIR-V.	2019-12-10 16:45:03 -03:00
ReinUsesLisp	233ed96a5c	vk_shader_decompiler: Fix build issues on old gcc versions	2019-12-10 01:55:38 -03:00
ReinUsesLisp	d30cf51d7d	vk_shader_decompiler: Reduce YNegate's severity	2019-12-09 23:52:28 -03:00
ReinUsesLisp	0b5b93053d	shader_ir/other: Implement S2R InvocationId	2019-12-09 23:52:28 -03:00
ReinUsesLisp	ecbfa416f0	vk_shader_decompiler: Misc changes Update Sirit and its usage in vk_shader_decompiler. Highlights: - Implement tessellation shaders - Implement geometry shaders - Implement some missing features - Use native half float instructions when available.	2019-12-09 23:51:57 -03:00
ReinUsesLisp	9ad6327fbd	shader: Keep track of shaders using warp instructions	2019-12-09 23:40:41 -03:00
ReinUsesLisp	6233b1db08	shader_ir/memory: Implement patch stores	2019-12-09 23:25:21 -03:00
ReinUsesLisp	19ce0d4f1a	vk_device: Misc changes - Setup more features and requirements. - Improve logging for missing features. - Collect telemetry parameters. - Add queries for more image formats. - Query push constants limits. - Optionally enable some extensions.	2019-12-09 01:04:48 -03:00
bunnei	faf5ae6a50	Merge pull request #3198 from ReinUsesLisp/tessellation-maxwell maxwell_3d: Add tessellation state entries	2019-12-08 22:28:25 -05:00
ReinUsesLisp	7ea362e134	externals: Update Vulkan-Headers	2019-12-08 22:08:19 -03:00
ReinUsesLisp	f632d00eb1	vk_swapchain: Add support for swapping sRGB We don't know until the game is running if it's using an sRGB color space or not. Add support for hot-swapping swapchain surface formats.	2019-12-06 22:42:08 -03:00
ReinUsesLisp	36651f215a	maxwell_3d: Add tessellation tess level registers	2019-12-06 22:08:22 -03:00
ReinUsesLisp	707bf41c6f	maxwell_3d: Add tessellation mode register	2019-12-06 22:07:31 -03:00
ReinUsesLisp	d2b50c5ebd	maxwell_3d: Add patch vertices register	2019-12-06 22:06:53 -03:00
ReinUsesLisp	74f515e8b6	shader_bytecode: Remove corrupted character	2019-12-06 20:31:56 -03:00
bunnei	e36814d6d5	Merge pull request #3109 from FernandoS27/new-instr Implement FLO & TXD Instructions on GPU Shaders	2019-12-06 18:18:16 -05:00
bunnei	3c1b6b5723	Merge pull request #2987 from FernandoS27/texture-invalid Texture_Cache: Redo invalid Surfaces handling.	2019-12-02 12:07:05 -05:00
bunnei	930b7c18a6	Merge pull request #3184 from ReinUsesLisp/framebuffer-cache gl_framebuffer_cache: Optimize framebuffer cache management	2019-11-30 18:46:40 -05:00
ReinUsesLisp	ff64c3951a	texture_cache/surface_base: Fix out of bounds texture views Some texture views were being created out of bounds (with more layers or mipmaps than what the original texture has). This is because of a miscalculation in mipmap bounding. end_layer and end_mipmap are out of bounds (e.g. layer 6 in a cubemap), there's no need to add one more there. Fixes OpenGL errors and Vulkan crashes on Splatoon 2.	2019-11-29 16:51:14 -03:00
ReinUsesLisp	fb6cf12a17	gl_framebuffer_cache: Optimize framebuffer key Pack color attachment enumerations into a single u32. To determine the number of buffers, the highest color attachment with a shared pointer that doesn't point to null is used.	2019-11-28 23:02:20 -03:00
ReinUsesLisp	c34da106ed	gl_rasterizer: Re-enable framebuffer cache for clear buffers	2019-11-28 23:02:20 -03:00
ReinUsesLisp	e6a0a30334	renderer_opengl: Make ScreenRectVertex's constructor constexpr	2019-11-28 20:36:02 -03:00
ReinUsesLisp	dee7844443	renderer_opengl: Remove C casts	2019-11-28 20:28:27 -03:00
ReinUsesLisp	3a44faff11	renderer_opengl: Use explicit binding for presentation shaders	2019-11-28 20:25:56 -03:00
ReinUsesLisp	75cc501d52	renderer_opengl: Drop macros for message decorations	2019-11-28 20:15:25 -03:00
ReinUsesLisp	056f049b26	renderer_opengl: Move static definitions to anonymous namespace	2019-11-28 20:14:40 -03:00
ReinUsesLisp	4589582eaf	renderer_opengl: Move commentaries to header file	2019-11-28 20:11:03 -03:00
bunnei	e3ee017e91	Merge pull request #3169 from lioncash/memory core/memory: Deglobalize memory management code	2019-11-28 11:43:17 -05:00
Rodrigo Locatti	913d0bb269	Merge pull request #3174 from lioncash/optional video_core/gpu_thread: Tidy up SwapBuffers()	2019-11-27 20:35:31 -03:00
Lioncash	aed6d8bef5	video_core/gpu_thread: Tidy up SwapBuffers() We can just use std::nullopt and std::make_optional to make this a little bit less noisy.	2019-11-27 17:46:11 -05:00
Lioncash	9403979c22	video_core/const_buffer_locker: Make use of std::tie in HasEqualKeys() Tidies it up a little bit visually.	2019-11-27 05:53:43 -05:00
Lioncash	930e311526	video_core/const_buffer_locker: Remove unused includes	2019-11-27 05:51:13 -05:00
Lioncash	9341ca7979	video_core/const_buffer_locker: Remove #pragma once from cpp file Silences a compiler warning.	2019-11-27 05:50:51 -05:00
Lioncash	849581075a	core/memory: Migrate over RasterizerMarkRegionCached() to the Memory class This is only used within the accelerated rasterizer in two places, so this is also a very trivial migration.	2019-11-26 21:55:38 -05:00
Lioncash	3f08e8d8d4	core/memory: Migrate over GetPointer() With all of the interfaces ready for migration, it's trivial to migrate over GetPointer().	2019-11-26 21:55:38 -05:00
Lioncash	536fc7f0ea	core: Prepare various classes for memory read/write migration Amends a few interfaces to be able to handle the migration over to the new Memory class by passing the class by reference as a function parameter where necessary. Notably, within the filesystem services, this eliminates two ReadBlock() calls by using the helper functions of HLERequestContext to do that for us.	2019-11-26 21:55:37 -05:00
bunnei	6df6caaf5f	Merge pull request #3143 from ReinUsesLisp/indexing-bug gl_device: Deduce indexing bug from device instead of heuristic	2019-11-26 21:53:12 -05:00
ReinUsesLisp	ef4446cb11	gl_shader_decompiler: Fix casts from fp32 to f16 Casts from f32 to f16 zeroes the higher half of the target register.	2019-11-25 22:22:33 -03:00
ReinUsesLisp	410d44ce05	gl_device: Deduce indexing bug from device instead of heuristic The heuristic to detect AMD's driver was not working properly since it also included Intel. Instead of using heuristics to detect it, compare the GL_VENDOR string.	2019-11-25 16:15:22 -03:00
bunnei	2899c93818	Merge pull request #3158 from ReinUsesLisp/srgb-blit gl_texture_cache: Apply sRGB on blits	2019-11-24 20:47:13 -05:00
bunnei	33a6b45a6c	Merge pull request #3155 from bunnei/fix-asynch-gpu-wait gpu_thread: Don't spin wait if there are no GPU commands.	2019-11-24 20:19:25 -05:00
bunnei	b03242067d	Merge pull request #3098 from ReinUsesLisp/shader-invalidations gl_shader_cache: Miscellaneous changes to shaders	2019-11-24 19:36:30 -05:00
ReinUsesLisp	74fff717aa	gl_texture_cache: Apply sRGB on blits glBlitFramebuffer keeps in mind GL_FRAMEBUFFER_SRGB's state. Enable this depending on the target surface pixel format.	2019-11-24 18:13:33 -03:00
bunnei	b7031b2b9d	Merge pull request #3105 from ReinUsesLisp/fix-stencil-reg maxwell_3d: Fix stencil_back_func_mask offset	2019-11-24 13:53:23 -05:00
bunnei	e81e0036b4	Merge pull request #3145 from ReinUsesLisp/buffer-cache-init buffer_cache: Remove brace initialized for objects with default constructor	2019-11-24 02:55:02 -05:00
bunnei	9ec84fc592	gpu_thread: Don't spin wait if there are no GPU commands.	2019-11-23 15:17:28 -05:00
bunnei	4ed183ee42	Merge pull request #3141 from ReinUsesLisp/gl-position gl_shader_gen: Apply default value to gl_Position	2019-11-23 13:23:46 -05:00
ReinUsesLisp	dc2e83fa31	gl_device: Reserve base bindings on limited devices SSBOs and other resources are limited per pipeline on Intel and AMD. Heuristically reserve resources per stage having in mind the reported OpenGL limits.	2019-11-22 21:28:50 -03:00
ReinUsesLisp	e3d7334be9	gl_state: Skip null texture binds glBindTextureUnit doesn't support null textures. Skip binding these.	2019-11-22 21:28:50 -03:00
ReinUsesLisp	919ac2c4d3	gl_rasterizer: Disable compute shaders on Intel Intel's proprietary driver enters in a corrupt state when compute shaders are executed. For now, disable these.	2019-11-22 21:28:50 -03:00
ReinUsesLisp	894ad74b87	gl_shader_cache: Hack shared memory size The current shared memory size seems to be smaller than what the game actually uses. This makes Nvidia's driver consistently blow up; in the case of FE3H it made it explode on Qt's SwapBuffers while SDL2 worked just fine. For now keep this hack since it's still progress over the previous hardcoded shared memory size.	2019-11-22 21:28:49 -03:00
ReinUsesLisp	e35b9597ef	gl_shader_decompiler: Normalize image bindings	2019-11-22 21:28:49 -03:00
ReinUsesLisp	36d9b409fc	gl_shader_decompiler: Normalize cbuf bindings Stage and compute shaders were using a different binding counter. Normalize these.	2019-11-22 21:28:49 -03:00
ReinUsesLisp	f936b86c7c	gl_rasterizer: Add missing cbuf counter reset on compute	2019-11-22 21:28:49 -03:00
ReinUsesLisp	180417c514	gl_shader_cache: Remove dynamic BaseBinding specialization	2019-11-22 21:28:49 -03:00
ReinUsesLisp	c8a48aacc0	video_core: Unify ProgramType and ShaderStage into ShaderType	2019-11-22 21:28:48 -03:00
ReinUsesLisp	0f23359a44	gl_rasterizer: Bind graphics images to draw commands Images were not being bound to draw invocations because these would require a cache invalidation.	2019-11-22 21:28:48 -03:00
ReinUsesLisp	287ae2b9e8	gl_shader_cache: Specialize local memory size for compute shaders Local memory size in compute shaders was stubbed with an arbitrary size. This commit specializes local memory size from guest GPU parameters.	2019-11-22 21:28:48 -03:00
ReinUsesLisp	dbeb523879	gl_shader_cache: Specialize shared memory size Shared memory was being declared with an undefined size. Specialize from guest GPU parameters the compute shader's shared memory size.	2019-11-22 21:28:47 -03:00
ReinUsesLisp	4f5d8e4342	gl_shader_cache: Specialize shader workgroup Drop the usage of ARB_compute_variable_group_size and specialize compute shaders instead. This permits compute to run on AMD and Intel proprietary drivers.	2019-11-22 21:28:47 -03:00
ReinUsesLisp	dc9961f341	shader/texture: Handle TLDS texture type mismatches Some games like "Fire Emblem: Three Houses" bind 2D textures to offsets used by instructions of 1D textures. To handle the discrepancy this commit uses the texture type from the binding and modifies the emitted code IR to build a valid backend expression. E.g.: Bound texture is 2D and instruction is 1D, the emitted IR samples a 2D texture in the coordinate ivec2(X, 0).	2019-11-22 21:28:47 -03:00
ReinUsesLisp	32c1bc6a67	shader/texture: Deduce texture buffers from locker Instead of specializing shaders to separate texture buffers from 1D textures, use the locker to deduce them while they are being decoded.	2019-11-22 21:28:47 -03:00
ReinUsesLisp	73aaf365e7	buffer_cache: Remove brace initialized for objects with default constructor	2019-11-20 16:00:40 -03:00
Fernando Sahmkow	cc81c0ce64	Texture_Cache: Redo invalid Surfaces handling. This commit aims to redo the full setup of invalid textures and guarantee correct behavior across backends in the case of finding one by using black dummy textures that match the target of the expected texture.	2019-11-20 14:59:35 -04:00
ReinUsesLisp	24f4198cee	shader/other: Reduce DEPBAR log severity While DEPBAR is stubbed it doesn't change anything from our end. Shading languages handle what this instruction does implicitly. We are not getting anything out of this log except noise.	2019-11-19 21:26:40 -03:00
ReinUsesLisp	bc10714dcf	gl_shader_gen: Apply default value to gl_Position Nvidia has sane default output values for varyings, but the other vendors don't apply these. To properly emulate this we would have to analyze the shader header. For the time being, apply the same default Nvidia applies so we get the same behaviour on non-Nvidia drivers.	2019-11-19 20:32:01 -03:00
bunnei	b0819e2ffb	Merge pull request #3086 from ReinUsesLisp/format-lookups texture_cache: Use a flat table instead of switch for texture format lookups	2019-11-19 18:29:17 -05:00
Fernando Sahmkow	c8473f399e	Shader_IR: Address Feedback	2019-11-18 07:34:34 -04:00
bunnei	a8295d2c53	Merge pull request #3047 from ReinUsesLisp/clip-control gl_rasterizer: Emulate viewport flipping with ARB_clip_control	2019-11-15 12:09:19 -05:00
ReinUsesLisp	4681381a34	format_lookup_table: Address feedback format_lookup_table: Drop bitfields format_lookup_table: Use std::array for definition table format_lookup_table: Include <limits> instead of <numeric>	2019-11-14 20:57:30 -03:00
ReinUsesLisp	80eacdf89b	texture_cache: Use a table instead of switch for texture formats Use a large flat array to look up texture formats. This allows us to properly implement formats with different component types. It should also be faster.	2019-11-14 20:57:10 -03:00
ReinUsesLisp	48a1687f51	texture_cache: Drop abstracted ComponentType Abstracted ComponentType was not being used in a meaningful way. This commit drops its usage. There is one place where it was being used to test compatibility between two cached surfaces, but this one is implied in the pixel format. Removing the component type test doesn't change the behaviour.	2019-11-14 18:21:42 -03:00
greggameplayer	c6bc13d0aa	correct the implementation of RGBA16UI	2019-11-14 21:37:39 +01:00
Fernando Sahmkow	cd0f5dfc17	Shader_IR: Implement TXD instruction.	2019-11-14 11:15:27 -04:00
Fernando Sahmkow	f3d1b370aa	Shader_IR: Implement FLO instruction.	2019-11-14 11:15:27 -04:00
Fernando Sahmkow	95137a04e1	Shader_Bytecode: Add encodings for FLO, SHF and TXD	2019-11-14 11:15:26 -04:00
Fernando Sahmkow	b6f6733131	Merge pull request #3081 from ReinUsesLisp/fswzadd-shuffles shader: Implement FSWZADD and reimplement SHFL	2019-11-14 10:27:27 -04:00
ReinUsesLisp	7990220df7	maxwell_3d: Fix stencil_back_func_mask offset stencil_back_func_mask and stencil_back_mask were misplaced. This commit addresses that issue.	2019-11-13 16:35:17 -03:00
Rodrigo Locatti	cf770a68a5	Merge pull request #3084 from ReinUsesLisp/cast-warnings video_core: Treat implicit conversions as errors	2019-11-13 02:16:22 -03:00
Rodrigo Locatti	fb9418798d	video_core: Enable sign conversion warnings Enable sign conversion warnings but don't treat them as errors.	2019-11-11 18:00:37 -03:00
bunnei	0fc596de6e	Merge pull request #3082 from ReinUsesLisp/fix-lockers gl_shader_cache: Fix locker constructors	2019-11-09 13:58:36 -05:00
ReinUsesLisp	18c1cb68fd	video_core: Treat implicit conversions as errors	2019-11-08 22:49:39 +00:00
ReinUsesLisp	096f339a2a	video_core: Silence implicit conversion warnings	2019-11-08 22:48:50 +00:00
bunnei	a056d8de16	Merge pull request #3080 from FernandoS27/glsl-fix GLSLDecompiler: Correct Texture Gather Offset.	2019-11-08 15:56:29 -05:00
ReinUsesLisp	bfa973a62b	gl_shader_cache: Fix locker constructors Properly pass engine when a shader is being constructed from memory.	2019-11-07 20:43:31 -03:00
ReinUsesLisp	3ab0514698	gl_shader_cache: Enable extensions only when available Silence GLSL compilation warnings.	2019-11-07 20:08:42 -03:00
ReinUsesLisp	cd66395944	gl_shader_decompiler: Add safe fallbacks when ARB_shader_ballot is not available	2019-11-07 20:08:42 -03:00
ReinUsesLisp	56e237d1f9	shader_ir/warp: Implement FSWZADD	2019-11-07 20:08:41 -03:00
ReinUsesLisp	08b2b1080a	gl_shader_decompiler: Reimplement shuffles with platform agnostic intrinsics	2019-11-07 20:08:41 -03:00
Fernando Sahmkow	3d7c284e0f	GLSLDecompiler: Correct Texture Gather Offset. This commit corrects the argument ordering in textureGatherOffset.	2019-11-07 11:43:56 -04:00
bunnei	b6ae48966d	Merge pull request #3032 from ReinUsesLisp/simplify-control-flow-brx shader/control_flow: Abstract repeated code chunks in BRX tracking	2019-11-07 01:30:01 -05:00
Morph	0e8a3bf3e5	buffer_cache: Add missing includes (#3079) `boost::make_iterator_range` is available when `boost/range/iterator_range.hpp` is included. Also include `boost/icl/interval_map.hpp` and `boost/icl/interval_set.hpp`.	2019-11-07 06:25:53 +00:00
bunnei	344d15f61e	Merge pull request #3070 from ReinUsesLisp/shader-warnings shader_ir: Reduce severity of warnings	2019-11-07 00:47:24 -05:00
ReinUsesLisp	e9d2fad984	gl_rasterizer: Remove front facing hack	2019-11-07 01:52:18 -03:00
ReinUsesLisp	f1facaeaef	gl_shader_decompiler: Fix typo "y_negate"->"y_direction"	2019-11-07 01:52:18 -03:00
ReinUsesLisp	e2ea0c3e11	gl_shader_manager: Remove unused variable in SetFromRegs	2019-11-07 01:52:18 -03:00
ReinUsesLisp	f019817f8f	gl_rasterizer: Emulate viewport flipping with ARB_clip_control Emulates negative y viewports with ARB_clip_control. This allows us to more easily emulate pipelines with tessellation and/or geometry shader stages. It also avoids corrupting games with transform feedbacks and negative viewports (gl_Position.y was being modified).	2019-11-07 01:52:18 -03:00
Rodrigo Locatti	ff5a0f370c	shader/control_flow: Specify constness on caller lambdas Update src/video_core/shader/control_flow.cpp Co-Authored-By: Mat M. <mathew1800@gmail.com> Update src/video_core/shader/control_flow.cpp Co-Authored-By: Mat M. <mathew1800@gmail.com> Update src/video_core/shader/control_flow.cpp Co-Authored-By: Mat M. <mathew1800@gmail.com> Update src/video_core/shader/control_flow.cpp Co-Authored-By: Mat M. <mathew1800@gmail.com> Update src/video_core/shader/control_flow.cpp Co-Authored-By: Mat M. <mathew1800@gmail.com> Update src/video_core/shader/control_flow.cpp Co-Authored-By: Mat M. <mathew1800@gmail.com>	2019-11-07 01:44:09 -03:00
ReinUsesLisp	7b069252f8	shader/control_flow: Use callable template instead of std::function	2019-11-07 01:44:08 -03:00
ReinUsesLisp	46c3047283	shader/control_flow: Abstract repeated code chunks in BRX tracking Remove copied and pasted for cycles into a common templated function.	2019-11-07 01:44:08 -03:00
ReinUsesLisp	ae7dfa93be	shader/control_flow: Silence Intellisense cast warnings	2019-11-07 01:44:08 -03:00
ReinUsesLisp	deb1b54eed	shader/control_flow: Remove brace initializer in std containers These containers have a default constructor.	2019-11-07 01:44:08 -03:00
ReinUsesLisp	39c66abd91	shader/decode: Reduce severity of arithmetic rounding warnings	2019-11-07 01:43:38 -03:00
ReinUsesLisp	c4374d0d41	shader/arithmetic: Reduce RRO stub severity	2019-11-07 01:43:38 -03:00
ReinUsesLisp	35d40b74b3	shader/texture: Remove NODEP warnings These warnings don't offer meaningful information while decoding shaders. Remove them.	2019-11-07 01:43:38 -03:00
bunnei	468576284d	Merge pull request #3057 from ReinUsesLisp/buffer-sub-data gl_rasterizer: Upload constant buffers with glNamedBufferSubData	2019-11-06 10:08:55 -05:00
Rodrigo Locatti	654b77d2ec	Merge pull request #3039 from ReinUsesLisp/cleanup-samplers shader/node: Unpack bindless texture encoding	2019-11-06 04:54:11 +00:00
bunnei	21e07df7b7	Merge pull request #2914 from FernandoS27/fermi-fix Fermi2D: limit blit area to only available area	2019-11-05 20:45:24 -05:00
bunnei	1bdae0fe29	common_func: Use std::array for INSERT_PADDING_* macros. - Zero initialization here is useful for determinism.	2019-11-03 22:22:41 -05:00
ReinUsesLisp	442a1cc021	gl_rasterizer: Re-enable stream buffer memory due to global memory Global memory is still using the stream buffer when it shouldn't. As a temporary fix re-enable the stream buffer on compute.	2019-11-02 13:19:19 -03:00
ReinUsesLisp	76ca2a5f82	gl_rasterizer: Upload constant buffers with glNamedBufferSubData Nvidia's OpenGL driver maps gl(Named)BufferSubData with some requirements to a fast path. This path has an extra memcpy but updates the buffer without orphaning or waiting for previous calls. It can be seen as a better model for "push constants" that can upload a whole UBO instead of 256 bytes. This path has some requirements established here: http://on-demand.gputechconf.com/gtc/2014/presentations/S4379-opengl-44-scene-rendering-techniques.pdf#page=24 Instead of using the stream buffer, this commit moves constant buffer uploads to calls of glNamedBufferSubData and from my testing it brings a performance improvement. This is disabled when the vendor is not Nvidia since it brings performance regressions.	2019-11-02 05:05:34 -03:00
Fernando Sahmkow	23cabc98db	Shader_IR: Fix regression on TLD4 Originally on the last commit I thought TLD4 acted the same as TLD4S and didn't have a mask. It actually does have a component mask. This commit corrects that.	2019-10-30 21:14:57 -04:00
Rodrigo Locatti	658489ebf7	Merge pull request #3050 from FernandoS27/fix-tld4 shader_ir: Fix TLD4 and add bindless variant	2019-10-30 18:37:17 +00:00
Fernando Sahmkow	9293c3a0f2	Shader_IR: Fix TLD4 and add Bindless Variant. This commit fixes an issue where not all 4 results of tld4 were being written, the color component was defaulted to red, among other things. It also implements the bindless variant.	2019-10-30 12:02:03 -04:00
bunnei	2382bbe3ac	Merge pull request #3046 from ReinUsesLisp/clean-gl-state gl_state: Miscellaneous clean up	2019-10-29 22:50:04 -04:00
bunnei	b5138f3c35	Merge pull request #3035 from ReinUsesLisp/rasterizer-accelerated rasterizer_accelerated: Add intermediary for GPU rasterizers	2019-10-29 22:06:41 -04:00
Rodrigo Locatti	3d0cde6a75	gl_state: Use std::array::fill instead of std::fill Co-Authored-By: Mat M. <mathew1800@gmail.com>	2019-10-30 01:30:31 +00:00
ReinUsesLisp	ce20ed8e4e	gl_state: Move dirty checks to individual apply calls instead of Apply This requires removing constness from some methods, but for consistency it's removed in all methods.	2019-10-29 21:27:25 -03:00
ReinUsesLisp	3c6557c235	gl_state: Remove ApplyDefaultState OpenGL has default values we can trust. Remove these.	2019-10-29 21:27:25 -03:00
ReinUsesLisp	d3651b0b82	gl_state: Change SetDefaultViewports to use default constructor	2019-10-29 21:27:24 -03:00
ReinUsesLisp	c7698d0bc8	gl_state: Minor style changes	2019-10-29 21:27:24 -03:00
ReinUsesLisp	a14d202ac2	gl_state: Remove unused Citra TextureUnits	2019-10-29 21:27:24 -03:00
ReinUsesLisp	28fece8e9b	gl_state: Move initializers from constructor to class declaration	2019-10-29 21:27:23 -03:00
ReinUsesLisp	a993df1ee2	shader/node: Unpack bindless texture encoding Bindless textures were using u64 to pack the buffer and offset from where they come from. Drop this in favor of separated entries in the struct. Remove the usage of std::set in favor of std::list (it's not std::vector to avoid reference invalidations) for samplers and images.	2019-10-29 20:53:48 -03:00
Rodrigo Locatti	2ec5b55ee3	Merge pull request #3004 from ReinUsesLisp/maxwell3d-cleanup maxwell_3d: Remove unused entries	2019-10-29 23:46:33 +00:00
Rodrigo Locatti	c5d9589942	Merge pull request #3037 from FernandoS27/new-formats video_core: Implement texture format E5B9G9R9_SHAREDEXP.	2019-10-28 01:36:58 -03:00
ReinUsesLisp	fa31e5b868	maxwell_3d/kepler_compute: Remove unused arguments in GetTexture	2019-10-28 00:23:42 -03:00
ReinUsesLisp	538ddd220e	video_core/textures: Remove unused index entry in FullTextureInfo	2019-10-28 00:14:38 -03:00
ReinUsesLisp	961fe4d19b	maxwell_3d: Remove unused method GetStageTextures	2019-10-28 00:14:29 -03:00
Fernando Sahmkow	3f9262195b	Video_Core: Implement texture format E5B9G9R9_SHAREDEXP. This commit implements the E5B9G9R9 Texture format into the general system and OpenGL backend.	2019-10-27 16:44:09 -04:00
bunnei	6909b2f0f9	Merge pull request #3034 from ReinUsesLisp/w4244-maxwell3d maxwell_3d: Silence implicit conversion warnings	2019-10-27 15:08:59 -04:00
ReinUsesLisp	3e469cecc1	maxwell_3d: Silence implicit conversion warnings While we are at it, unify types for dirty reg pointers.	2019-10-27 15:22:17 -03:00
ReinUsesLisp	bd2aff3e26	rasterizer_accelerated: Add intermediary for GPU rasterizers Add an intermediary class that implements common functions across GPU accelerated rasterizers. This avoids code repetition on different backends.	2019-10-27 03:40:08 -03:00
ReinUsesLisp	a5aa1bb174	astc: Silence implicit conversion warnings	2019-10-27 03:04:50 -03:00
Rodrigo Locatti	26f3e18c5c	Merge pull request #2976 from FernandoS27/cache-fast-brx-rebased Implement Fast BRX, fix TXQ and adapt the Shader Cache for it	2019-10-26 16:56:13 -03:00
Fernando Sahmkow	be856a38d6	Shader_IR: Address Feedback.	2019-10-26 15:38:30 -04:00
Rodrigo Locatti	a0d79085c4	Merge pull request #3027 from lioncash/lookup shader_ir: Use std::array with std::pair instead of std::unordered_map	2019-10-26 05:49:15 -03:00
Rodrigo Locatti	d52598173d	Merge pull request #3013 from FernandoS27/tld4s-fix Shader_Ir: Fix TLD4S from using a component mask.	2019-10-25 20:06:26 -03:00
Fernando Sahmkow	e3afd6595a	Shader_IR: Clang format	2019-10-25 09:01:32 -04:00
ReinUsesLisp	78f3e8a757	gl_shader_cache: Implement locker variants invalidation	2019-10-25 09:01:32 -04:00
ReinUsesLisp	ec85648af3	gl_shader_disk_cache: Store and load fast BRX	2019-10-25 09:01:31 -04:00
ReinUsesLisp	fa2c297f3e	const_buffer_locker: Minor style changes	2019-10-25 09:01:31 -04:00
ReinUsesLisp	7b81ba4d8a	gl_shader_decompiler: Move entries to a separate function	2019-10-25 09:01:31 -04:00
Fernando Sahmkow	1244f2d368	Shader_IR: Implement Fast BRX and allow multi-branches in the CFG.	2019-10-25 09:01:31 -04:00
Fernando Sahmkow	a05120ec0b	Shader_IR: Correct typo in Consistent method.	2019-10-25 09:01:30 -04:00
Fernando Sahmkow	33fcec3502	Shader_IR: allow lookup of texture samplers within the shader_ir for instructions that don't provide it	2019-10-25 09:01:30 -04:00
Fernando Sahmkow	8909f52166	Shader_IR: Implement Fast BRX and allow multi-branches in the CFG.	2019-10-25 09:01:30 -04:00
Fernando Sahmkow	acd6441134	Shader_Cache: setup connection of ConstBufferLocker	2019-10-25 09:01:29 -04:00
Fernando Sahmkow	1a58f45d76	VideoCore: Unify const buffer accessing along engines and provide ConstBufferLocker class to shaders.	2019-10-25 09:01:29 -04:00
Fernando Sahmkow	2ef696c85a	Shader_IR: Implement BRX tracking.	2019-10-25 09:01:29 -04:00
Rodrigo Locatti	5062728669	Merge pull request #3028 from lioncash/constexpr shader_bytecode: Make Matcher constexpr capable	2019-10-24 15:10:40 -03:00
Lioncash	7fdf991097	shader_bytecode: Make Matcher constexpr capable Greatly shrinks the amount of generated code for GetDecodeTable(). Collapses an assembly output of 9000+ lines down to ~3621 with Clang, and 6513 down to ~2616 with GCC, given it's now allowed to construct all the entries as a sequence of constant data.	2019-10-24 01:10:10 -04:00
Lioncash	382717172e	shader_ir: Use std::array with pair instead of unordered_map Given the overall size of the maps are very small, we can use arrays of pairs here instead of always heap allocating a new map every time the functions are called. Given the small size of the maps, the difference in container lookups are negligible, especially given the entries are already sorted.	2019-10-24 00:25:38 -04:00
Lioncash	1f5401c89c	video_core/shader: Resolve instances of variable shadowing Silences a few -Wshadow warnings.	2019-10-23 23:00:31 -04:00
Fernando Sahmkow	c4a0aa9207	Merge pull request #2995 from ReinUsesLisp/ignore-gmem shader_ir/memory: Ignore global memory when tracking fails	2019-10-22 13:22:43 -04:00
Fernando Sahmkow	7ecf9f7228	Merge pull request #2983 from lioncash/fallthrough gl_shader_decompiler/vk_shader_decompiler: Resolve implicit fallthrough cases	2019-10-22 13:16:46 -04:00
Fernando Sahmkow	1509d2ffbd	Shader_Ir: Fix TLD4S from using a component mask. TLD4S always outputs 4 values, the previous code checked a component mask and omitted those values that weren't part of it. This commit corrects that and makes sure all 4 values are set.	2019-10-22 10:59:07 -04:00
ReinUsesLisp	1ea07954fb	shader_ir/memory: Ignore global memory when tracking fails Ignore global memory operations instead of invoking undefined behaviour when constant buffer tracking fails and we are blasting through asserts, ignore the operation. In the case of LDG this means filling the destination registers with zeroes; for STG this means ignore the instruction as a whole. The default behaviour is still to abort execution on failure.	2019-10-22 02:49:17 -03:00

3907 commits