B3n30
dc6a365337
Merge pull request #2951 from huwpascoe/perf-4
...
Optimized Morton
2017-09-25 08:28:55 +02:00
Huw Pascoe
903906da3b
Optimized Float<M,E> multiplication
...
Before:
ucomiss xmm1, xmm1
jp .L9
pxor xmm2, xmm2
mov edx, 1
ucomiss xmm0, xmm2
setp al
cmovne eax, edx
test al, al
jne .L9
.L3:
movaps xmm0, xmm2
ret
.L9:
ucomiss xmm0, xmm0
jp .L10
pxor xmm2, xmm2
mov edx, 1
ucomiss xmm1, xmm2
setp al
cmovne eax, edx
test al, al
je .L3
After:
movaps xmm2, xmm1
mulss xmm2, xmm0
ucomiss xmm2, xmm2
jnp .L3
ucomiss xmm1, xmm0
jnp .L11
.L3:
movaps xmm0, xmm2
ret
.L11:
pxor xmm2, xmm2
jmp .L3
2017-09-25 00:54:02 +01:00
Huw Pascoe
876aa82c29
Optimized Morton
2017-09-24 22:27:14 +01:00
James Rowe
93930a966f
Merge pull request #2921 from jroweboy/batch-fix-2
...
GPU: Add draw for immediate and batch modes
2017-09-24 07:57:16 -06:00
James Rowe
19d41dcc6e
Remove pipeline.gpu_mode and fix minor issues
2017-09-23 09:28:20 -06:00
Yuri Kunde Schlesner
a7758b0b36
Merge pull request #2928 from huwpascoe/master
...
Fixed framebuffer warning
2017-09-22 04:06:38 +02:00
Huw Pascoe
a234e4c200
Improved performance of FromAttributeBuffer
...
Ternary operator is optimized by the compiler
whereas std::min() is meant to return a value.
I've noticed a 5%-10% emulation speed increase.
2017-09-17 15:56:36 +01:00
Huw Pascoe
6a110ac5f5
Fixed framebuffer warning
2017-09-17 11:57:06 +01:00
Yuri Kunde Schlesner
699c920991
Merge pull request #2900 from wwylele/clip-2
...
PICA: implement custom clip plane
2017-09-16 10:23:00 +02:00
James Rowe
ad0b57f407
GPU: Add draw for immediate and batch modes
...
PR #1461 introduced a regression where some games would change configuration
even while in the poorly named "drawing" mode, which broke the heuristic
citra was using to determine when to draw the batch. This change adds
back in a draw call for batching, and also adds in a draw call in
immediate mode each time it adds a triangle.
2017-09-11 09:21:43 -06:00
bunnei
11baa40d75
Merge pull request #2865 from wwylele/gs++
...
PICA: implemented geometry shader
2017-09-07 23:02:59 -04:00
bunnei
ff4941fb3a
Merge pull request #2914 from wwylele/fresnel-fix
...
pica/lighting: only apply Fresnel factor for the last light
2017-09-05 10:00:49 -04:00
wwylele
12fbc8c8df
pica/lighting: only apply Fresnel factor for the last light
2017-09-03 08:22:03 +03:00
wwylele
e2c41a5891
video_core: report telemetry for gas mode
2017-08-31 12:54:17 +03:00
bunnei
f0e461bf6f
Merge pull request #2891 from wwylele/sw-bump
...
SwRasterizer/Lighting: implement bump mapping
2017-08-30 21:07:30 -04:00
Weiyi Wang
647f017c6d
Merge pull request #2892 from Subv/warnings2
...
Warnings: Fixed a few missing-return warnings in video_core.
2017-08-28 03:21:51 -05:00
Subv
da88f3b8f0
Warnings: Fixed a few missing-return warnings in video_core.
2017-08-26 11:58:22 -05:00
wwylele
417cb45e3f
SwRasterizer/Clipper: flip the sign convention to match PICA and OpenGL
2017-08-25 07:26:45 +03:00
wwylele
addbcd5784
gl_rasterizer: implement custom clip plane
2017-08-25 07:26:45 +03:00
wwylele
ea51a3af26
SwRasterizer: implement custom clip plane
2017-08-24 15:34:27 +03:00
wwylele
17c6104d2a
gl_rasterizer/lighting: more accurate CP formula
2017-08-22 09:34:44 +03:00
wwylele
b5aa570354
SwRasterizer/Lighting: implement LUT input CP
2017-08-22 09:34:44 +03:00
wwylele
3e478ca131
SwRasterizer/Lighting: implement bump mapping
2017-08-22 09:34:44 +03:00
wwylele
63b6e802cd
swrasterizer: remove invalid TODO
...
This function is called in clipping, before the pespective divide, and is not used in later rasterization. Thus it doesn't need perspective correction.
2017-08-21 08:03:07 +03:00
wwylele
72b26ac32f
swrasterizer/clipper: remove tested TODO
...
hwtested. Current implementation is the correct behavior
2017-08-21 08:03:07 +03:00
wwylele
5a4af616c6
gl_shader_gen: simplify and clarify the depth transformation between vertex shader and fragment shader
2017-08-21 08:03:07 +03:00
wwylele
1eca380886
gl_rasterizer: add clipping plane z<=0 defined in PICA
2017-08-21 08:03:07 +03:00
Yuri Kunde Schlesner
46d1ca768d
Merge pull request #2872 from wwylele/sw-geo-factor
...
SwRasterizer/Lighting: implement geometric factor
2017-08-20 17:49:42 -07:00
James Rowe
8afa81ac1b
Merge pull request #2871 from wwylele/sw-spotlight
...
SwRasterizer/Lighting: implement spot light
2017-08-19 20:10:24 -06:00
wwylele
0f35755572
pica/command_processor: build geometry pipeline and run geometry shader
...
The geometry pipeline manages data transfer between VS, GS and primitive assembler. It has known four modes:
- no GS mode: sends VS output directly to the primitive assembler (what citra currently does)
- GS mode 0: sends VS output to GS input registers, and sends GS output to primitive assembler
- GS mode 1: sends VS output to GS uniform registers, and sends GS output to primitive assembler. It also takes an index from the index buffer at the beginning of each primitive for determine the primitive size.
- GS mode 2: similar to mode 1, but doesn't take the index and uses a fixed primitive size.
hwtest shows that immediate mode also supports GS (at least for mode 0), so the geometry pipeline gets refactored into its own class for supporting both drawing mode.
In the immediate mode, some games don't set the pipeline registers to a valid value until the first attribute input, so a geometry pipeline reset flag is set in `pipeline.vs_default_attributes_setup.index` trigger, and the actual pipeline reconfigure is triggered in the first attribute input.
In the normal drawing mode with index buffer, the vertex cache is a little bit modified to support the geometry pipeline. Instead of OutputVertex, it now holds AttributeBuffer, which is the input to the geometry pipeline. The AttributeBuffer->OutputVertex conversion is done inside the pipeline vertex handler. The actual hardware vertex cache is believed to be implemented in a similar way (because this is the only way that makes sense).
Both geometry pipeline and GS unit rely on states preservation across drawing call, so they are put into the global state. In the future, the other three vertex shader units should be also placed in the global state, and a scheduler should be implemented on top of the four units. Note that the current gs_unit already allows running VS on it in the future.
2017-08-19 10:13:20 +03:00
wwylele
8285ca4ad8
pica/shader/jit: implement SETEMIT and EMIT
2017-08-19 10:13:20 +03:00
wwylele
36981a5aa6
pica/primitive_assembly: Handle winding for GS primitive
...
hwtest shows that, although GS always emit a group of three vertices as one primitive, it still respects to the topology type, as if the three vertices are input into the primitive assembler independently and sequentially. It is also shown that the winding flag in SETEMIT only takes effect for Shader topology type, which is believed to be the actual difference between List and Shader (hence removed the TODO). However, only Shader topology type is observed in official games when GS is in use, so the other mode seems to be just unintended usage.
2017-08-19 10:13:20 +03:00
wwylele
bb63ae3052
correct constness
2017-08-19 10:13:20 +03:00
wwylele
28128348f2
pica/shader/interpreter: implement SETEMIT and EMIT
2017-08-19 10:13:20 +03:00
wwylele
46c6973d2b
pica/shader: extend UnitState for GS
...
Among four shader units in pica, a special unit can be configured to run both VS and GS program. GSUnitState represents this unit, which extends UnitState (which represents the other three normal units) with extra state for primitive emitting. It uses lots of raw pointers to represent internal structure in order to keep it standard layout type for JIT to access.
This unit doesn't handle triangle winding (inverting) itself; instead, it calls a WindingSetter handler. This will be explained in the following commits
2017-08-19 10:13:20 +03:00
wwylele
686fb3e78c
gl_shader_gen: don't call SampleTexture when bump map is not used
2017-08-11 18:35:00 +03:00
wwylele
945f9a1b04
SwRasterizer/Lighting: implement spot light
2017-08-11 01:19:10 +03:00
wwylele
14ee32c46a
SwRasterizer/Lighting: implement geometric factor
2017-08-11 01:18:43 +03:00
wwylele
5d9d42f0d0
SwRasterizer/Lighting: use make_tuple instead of constructor
...
implicit tuple constructor is a c++17 thing, which is not supported by some not-so-old libraries. Play safe for now
2017-08-10 12:19:58 +03:00
wwylele
db309b2423
pica/regs: layout geometry shader configuration regs
...
All the register meanings are derived from ctrulib (3dbrew is outdated for most of them)
2017-08-10 01:53:08 +03:00
Weiyi Wang
792dee47a7
Merge pull request #2822 from wwylele/sw_lighting-2
...
Implement fragment lighting in the sw renderer (take 2)
2017-08-09 18:54:29 +03:00
wwylele
baa24f4ea9
pica: upload shared shader code to both unit
2017-08-07 10:30:05 +03:00
wwylele
2252a63f80
SwRasterizer/Lighting: shorten file name
2017-08-03 13:51:22 +03:00
wwylele
eda28266fb
SwRasterizer/Lighting: move to its own file
2017-08-02 22:20:40 +03:00
wwylele
48b4105871
SwRasterizer/Lighting: reduce confusion
2017-08-02 22:07:15 +03:00
wwylele
c59ed47608
SwRasterizer/Lighting: move quaternion normalization to the caller
2017-08-02 22:05:53 +03:00
wwylele
c89f804a01
pica/shader_interpreter: fix off-by-one in LOOP
2017-07-27 13:48:27 +03:00
Sebastian Valle
c6a2e519ef
Merge pull request #2816 from wwylele/proctex-lutlutlut
...
gl_rasterizer: use texture buffer for proctex LUT
2017-07-22 23:03:48 -05:00
Sebastian Valle
e646bd902d
Merge pull request #2834 from wwylele/depth-enable-fix
...
gl_rasterizer_cache: fix using_depth_fb
2017-07-22 23:02:59 -05:00
bunnei
df8b9863f9
telemetry: Log performance, configuration, and system data.
2017-07-17 21:32:28 -04:00
wwylele
4feff63ffa
SwRasterizer/Lighting: dist atten lut input need to be clamp
2017-07-11 22:19:00 +03:00
wwylele
56e5425e59
SwRasterizer/Lighting: unify float suffix
2017-07-11 22:15:35 +03:00
wwylele
e415558a4f
SwRasterizer/Lighting: get rid of nested return
2017-07-11 22:15:35 +03:00
wwylele
c6d1472513
SwRasterizer/Lighting: refactor GetLutValue into a function.
...
merging similar pattern. Also makes the code more similar to the gl one
2017-07-11 22:15:35 +03:00
wwylele
f13cf506e0
SwRasterizer: only interpolate quat and view when lighting is enabled
2017-07-11 21:35:57 +03:00
wwylele
efc655aec0
SwRasterizer/Lighting: pass lighting state as parameter
2017-07-11 20:06:26 +03:00
Subv
9906feefbd
SwRasterizer/Lighting: Move the clamp highlight calculation to the end of the per-light loop body.
2017-07-11 19:39:15 +03:00
Subv
7526af5e52
SwRasterizer/Lighting: Move the lighting enable check outside the ComputeFragmentsColors function.
2017-07-11 19:39:15 +03:00
Subv
b8229a7684
SwRasterizer/Lighting: Do not use global registers state in ComputeFragmentsColors.
2017-07-11 19:39:15 +03:00
Subv
7bc467e872
SwRasterizer/Lighting: Do not use global state in LookupLightingLut.
2017-07-11 19:39:15 +03:00
Subv
37ac2b6657
SwRasterizer/Lighting: Fixed a bug where the distance attenuation bias was being set to the dist atten scale.
2017-07-11 19:39:15 +03:00
Subv
6250f52e93
SwRasterizer: Fixed a few conversion warnings and moved per-light values into the per-light loop.
2017-07-11 19:39:15 +03:00
Subv
2d69a9b8bf
SwRasterizer: Run clang-format
2017-07-11 19:39:15 +03:00
Subv
73566ff7a9
SwRasterizer: Flip the vertex quaternions before clipping (if necessary).
2017-07-11 19:39:15 +03:00
Subv
2a75837bc3
SwRasterizer: Corrected the light LUT lookups.
2017-07-11 19:39:15 +03:00
Subv
f2d4d5c219
SwRasterizer: Corrected the light LUT lookups.
2017-07-11 19:39:15 +03:00
Subv
80b6fc592e
SwRasterizer: Fixed the lighting lut lookup function.
2017-07-11 19:39:15 +03:00
Subv
10b0bea060
SwRasterizer: Calculate fresnel for fragment lighting.
2017-07-11 19:39:15 +03:00
Subv
46b8c8e1da
SwRasterizer: Calculate specular_1 for fragment lighting.
2017-07-11 19:39:15 +03:00
Subv
be25e78b07
SwRasterizer: Calculate specular_0 for fragment lighting.
2017-07-11 19:39:15 +03:00
Subv
b2f472a2b1
SwRasterizer: Implement primary fragment color.
2017-07-11 19:39:15 +03:00
wwylele
8482933db8
gl_rasterizer: use texture buffer for proctex LUT
2017-07-01 11:02:48 +03:00
wwylele
8978ecb09c
gl_rasterizer: use texture buffer for fog LUT
2017-06-22 20:41:00 +03:00
wwylele
f1e377f57e
gl_rasterizer: create the texture before applying the state
...
this is a rebasing error from #2792 . It doesn't affect much though, because the later more Apply() call fixes/hides it
2017-06-22 17:47:46 +03:00
wwylele
457659fe01
gl_state: reset 1d textures
2017-06-21 23:13:06 +03:00
wwylele
42f7ca7412
gl_rasterizer: fix glGetUniformLocation type
2017-06-21 23:13:06 +03:00
wwylele
be9e952bdc
gl_rasterizer: manage texture ids in one place
2017-06-21 23:13:06 +03:00
wwylele
ab60414122
gl_rasterizer/lighting: fix LUT interpolation
2017-06-21 23:13:06 +03:00
Yuri Kunde Schlesner
d0888f8548
Merge pull request #2776 from wwylele/geo-factor
...
Fragment lighting: implement geometric factor
2017-06-18 14:18:48 -07:00
wwylele
5a454173a8
gl_rasterizer/lighting: use the formula from the paper for germetic factor
2017-06-18 10:29:02 +03:00
Yuri Kunde Schlesner
f6715f98f5
Stop using reserved operator names (and/or/xor) with Xbyak
...
Also has the Dynarmic upgrade with the same change
2017-06-17 12:20:22 -07:00
wwylele
7052d43a67
gl_rasterizer/lighting: implement geometric factor
2017-06-15 14:59:01 +03:00
Yuri Kunde Schlesner
da1bec121a
Merge pull request #2762 from wwylele/light-cp-tangent
...
Fragment lighting: implement lut input 5 (CP) and tangent mapping
2017-06-14 20:08:26 -07:00
Yuri Kunde Schlesner
5fe5ccac42
Merge pull request #2743 from wwylele/wrap-fix
...
pica/rasterizer: implement/stub texture wrap mode 4-7
2017-06-13 21:28:12 -07:00
Yuri Kunde Schlesner
791cd14c8d
Merge pull request #2767 from yuriks/quaternion-flip-comment
...
OpenGL: Update comment on AreQuaternionsOpposite with new information
2017-06-12 16:31:55 -07:00
wwylele
972548e3ee
gl_rasterizer/lighting: Implement tangent mapping
2017-06-11 21:30:53 +03:00
wwylele
40b7d0bf3f
gl_rasterizer/lighting: implement lut input 5 (CP)
2017-06-11 21:30:53 +03:00
Sebastian Valle
39c7c1f580
Merge pull request #2727 from wwylele/spot-light
...
Fragment lighting: implement spot light
2017-06-11 18:23:47 +00:00
wwylele
b3b9468573
gl_rasterizer_cache: depth write is disabled if allow_depth_stencil_write is false
2017-06-10 15:10:34 +03:00
Yuri Kunde Schlesner
ba01a8302a
OpenGL: Update comment on AreQuaternionsOpposite with new information
...
While debugging the software renderer implementation, it was noticed
that this is actually exactly what the hardware does, upgrading the
status of this "hack" to being a proper implementation. And there was
much rejoicing.
2017-06-10 01:55:17 -07:00
wwylele
28d1e73d2f
pica/rasterizer: implement/stub texture wrap mode 4-7
2017-06-04 09:47:25 +03:00
bunnei
54ea95cca7
Merge pull request #2721 from wwylele/texture-cube
...
swrasterizer: implemented TextureCube
2017-05-30 10:21:05 -04:00
wwylele
10906dceec
gl_rasterizer: implement spot light
2017-05-30 10:54:58 +03:00
wwylele
686cbf3ac6
gl_rasterizer: sync spot light status
2017-05-30 10:54:58 +03:00
wwylele
b5addf8fb8
pica: prepare registers for spotlight
2017-05-30 10:54:58 +03:00
Yuri Kunde Schlesner
a4f88c7d7c
Merge pull request #2734 from yuriks/cmake-imported-libs
...
CMake: Use CMake target properties for all libraries
2017-05-29 15:12:21 -07:00
wwylele
0b9bb082c3
swrasterizer: implement TextureCube
2017-05-29 22:28:48 +03:00
wwylele
077cc683e5
pica: add registers for texture cube
2017-05-29 22:03:08 +03:00
Yuri Kunde Schlesner
3df85a103a
Merge pull request #2729 from yuriks/quaternion-fix
...
OpenGL: Improve accuracy of quaternion interpolation
2017-05-28 01:24:06 -07:00
Yuri Kunde Schlesner
d736cca848
CMake: Create INTERFACE targets for microprofile and nihstro
2017-05-27 22:34:52 -07:00