2019-10-13 02:02:07 -04:00
using Ryujinx.Common ;
2019-12-28 18:45:33 -05:00
using Ryujinx.Common.Logging ;
2019-10-13 02:02:07 -04:00
using Ryujinx.Graphics.GAL ;
2021-01-17 13:44:34 -05:00
using Ryujinx.Graphics.Gpu.Memory ;
2019-10-13 02:02:07 -04:00
using Ryujinx.Graphics.Texture ;
using Ryujinx.Graphics.Texture.Astc ;
Return mapped buffer pointer directly for flush, WriteableRegion for textures (#2494)
* Return mapped buffer pointer directly for flush, WriteableRegion for textures
A few changes here to generally improve performance, even for platforms not using the persistent buffer flush.
- Texture and buffer flush now return a ReadOnlySpan<byte>. It's guaranteed that this span is pinned in memory, but it will be overwritten on the next flush from that thread, so it is expected that the data is used before calling again.
- As a result, persistent mappings no longer copy to a new array - rather the persistent map is returned directly as a Span<>. A similar host array is used for the glGet flushes instead of allocating new arrays each time.
- Texture flushes now do their layout conversion into a WriteableRegion when the texture is not MultiRange, which allows the flush to happen directly into guest memory rather than into a temporary span, then copied over. This avoids another copy when doing layout conversion.
Overall, this saves 1 data copy for buffer flush, 1 copy for linear textures with matching source/target stride, and 2 copies for block textures or linear textures with mismatching strides.
* Fix tests
* Fix array pointer for Mesa/Intel path
* Address some feedback
* Update method for getting array pointer.
2021-07-19 18:10:54 -04:00
using Ryujinx.Memory ;
Memory Read/Write Tracking using Region Handles (#1272)
* WIP Range Tracking
- Texture invalidation seems to have large problems
- Buffer/Pool invalidation may have problems
- Mirror memory tracking puts an additional `add` in compiled code, we likely just want to make HLE access slower if this is the final solution.
- Native project is in the messiest possible location.
- [HACK] JIT memory access always uses native "fast" path
- [HACK] Trying some things with texture invalidation and views.
It works :)
Still a few hacks, messy things, slow things
More work in progress stuff (also move to memory project)
Quite a bit faster now.
- Unmapping GPU VA and CPU VA will now correctly update write tracking regions, and invalidate textures for the former.
- The Virtual range list is now non-overlapping like the physical one.
- Fixed some bugs where regions could leak.
- Introduced a weird bug that I still need to track down (consistent invalid buffer in MK8 ribbon road)
Move some stuff.
I think we'll eventually just put the dll and so for this in a nuget package.
Fix rebase.
[WIP] MultiRegionHandle variable size ranges
- Avoid reprotecting regions that change often (needs some tweaking)
- There's still a bug in buffers, somehow.
- Might want different api for minimum granularity
Fix rebase issue
Commit everything needed for software only tracking.
Remove native components.
Remove more native stuff.
Cleanup
Use a separate window for the background context, update opentk. (fixes linux)
Some experimental changes
Should get things working up to scratch - still need to try some things with flush/modification and res scale.
Include address with the region action.
Initial work to make range tracking work
Still a ton of bugs
Fix some issues with the new stuff.
* Fix texture flush instability
There's still some weird behaviour, but it's much improved without this. (textures with cpu modified data were flushing over it)
* Find the destination texture for Buffer->Texture full copy
Greatly improves performance for nvdec videos (with range tracking)
* Further improve texture tracking
* Disable Memory Tracking for view parents
This is a temporary approach to better match behaviour on master (where invalidations would be soaked up by views, rather than trigger twice)
The assumption is that when views are created to a texture, they will cover all of its data anyways. Of course, this can easily be improved in future.
* Introduce some tracking tests.
WIP
* Complete base tests.
* Add more tests for multiregion, fix existing test.
* Cleanup Part 1
* Remove unnecessary code from memory tracking
* Fix some inconsistencies with 3D texture rule.
* Add dispose tests.
* Use a background thread for the background context.
Rather than setting and unsetting a context as current, doing the work on a dedicated thread with signals seems to be a bit faster.
Also nerf the multithreading test a bit.
* Copy to texture with matching alignment
This extends the copy to work for some videos with unusual size, such as tutorial videos in SMO. It will only occur if the destination texture already exists at XCount size.
* Track reads for buffer copies. Synchronize new buffers before copying overlaps.
* Remove old texture flushing mechanisms.
Range tracking all the way, baby.
* Wake the background thread when disposing.
Avoids a deadlock when games are closed.
* Address Feedback 1
* Separate TextureCopy instance for background thread
Also `BackgroundContextWorker.InBackground` for a more sensible idenfifier for if we're in a background thread.
* Add missing XML docs.
* Address Feedback
* Maybe I should start drinking coffee.
* Some more feedback.
* Remove flush warning, Refocus window after making background context
2020-10-16 16:18:35 -04:00
using Ryujinx.Memory.Range ;
2019-10-13 02:02:07 -04:00
using System ;
using System.Collections.Generic ;
2019-10-30 19:45:01 -04:00
using System.Diagnostics ;
2021-08-29 15:22:13 -04:00
using System.Linq ;
2021-12-08 16:09:36 -05:00
using System.Numerics ;
2019-10-13 02:02:07 -04:00
namespace Ryujinx.Graphics.Gpu.Image
{
2019-12-29 18:26:37 -05:00
/// <summary>
/// Represents a cached GPU texture.
/// </summary>
2021-01-17 13:44:34 -05:00
class Texture : IMultiRangeItem , IDisposable
2019-10-13 02:02:07 -04:00
{
2020-10-25 16:09:45 -04:00
// How many updates we need before switching to the byte-by-byte comparison
// modification check method.
// This method uses much more memory so we want to avoid it if possible.
private const int ByteComparisonSwitchThreshold = 4 ;
2021-12-08 16:09:36 -05:00
private const int MinLevelsForForceAnisotropy = 5 ;
2021-03-06 09:43:55 -05:00
private struct TexturePoolOwner
{
public TexturePool Pool ;
public int ID ;
}
2019-10-13 02:02:07 -04:00
private GpuContext _context ;
2021-06-29 13:32:02 -04:00
private PhysicalMemory _physicalMemory ;
2019-10-13 02:02:07 -04:00
private SizeInfo _sizeInfo ;
2019-12-29 18:26:37 -05:00
/// <summary>
/// Texture format.
/// </summary>
2019-12-29 12:41:50 -05:00
public Format Format => Info . FormatInfo . Format ;
2019-10-13 02:02:07 -04:00
2020-12-03 14:34:27 -05:00
/// <summary>
/// Texture target.
/// </summary>
public Target Target { get ; private set ; }
2022-01-11 14:15:17 -05:00
/// <summary>
/// Texture width.
/// </summary>
public int Width { get ; private set ; }
/// <summary>
/// Texture height.
/// </summary>
public int Height { get ; private set ; }
2019-12-29 18:26:37 -05:00
/// <summary>
/// Texture information.
/// </summary>
2019-12-29 12:41:50 -05:00
public TextureInfo Info { get ; private set ; }
2019-10-13 02:02:07 -04:00
2021-12-08 16:09:36 -05:00
/// <summary>
/// Set when anisotropic filtering can be forced on the given texture.
/// </summary>
public bool CanForceAnisotropy { get ; private set ; }
2020-07-06 22:41:07 -04:00
/// <summary>
/// Host scale factor.
/// </summary>
public float ScaleFactor { get ; private set ; }
/// <summary>
/// Upscaling mode. Informs if a texture is scaled, or is eligible for scaling.
/// </summary>
public TextureScaleMode ScaleMode { get ; private set ; }
2021-03-02 17:30:54 -05:00
/// <summary>
/// Group that this texture belongs to. Manages read/write memory tracking.
/// </summary>
public TextureGroup Group { get ; private set ; }
2020-11-09 19:41:13 -05:00
/// <summary>
/// Set when a texture has changed size. This indicates that it may need to be
/// changed again when obtained as a sampler.
/// </summary>
2021-03-06 09:43:55 -05:00
public bool ChangedSize { get ; private set ; }
/// <summary>
2021-04-02 10:33:39 -04:00
/// Set when a texture's GPU VA has ever been partially or fully unmapped.
2021-03-06 09:43:55 -05:00
/// This indicates that the range must be fully checked when matching the texture.
/// </summary>
public bool ChangedMapping { get ; private set ; }
2020-11-09 19:41:13 -05:00
2022-01-09 11:28:48 -05:00
/// <summary>
/// True if the data for this texture must always be flushed when an overlap appears.
/// This is useful if SetData is called directly on this texture, but the data is meant for a future texture.
/// </summary>
public bool AlwaysFlushOnOverlap { get ; private set ; }
2019-10-13 02:02:07 -04:00
private int _depth ;
private int _layers ;
2021-03-02 17:30:54 -05:00
public int FirstLayer { get ; private set ; }
public int FirstLevel { get ; private set ; }
2019-10-13 02:02:07 -04:00
private bool _hasData ;
2021-03-02 17:30:54 -05:00
private bool _dirty = true ;
2020-10-25 16:09:45 -04:00
private int _updateCount ;
private byte [ ] _currentData ;
2019-10-13 02:02:07 -04:00
2022-01-09 11:28:48 -05:00
private bool _modifiedStale = true ;
2019-10-13 02:02:07 -04:00
private ITexture _arrayViewTexture ;
private Target _arrayViewTarget ;
2020-10-16 16:18:35 -04:00
private ITexture _flushHostTexture ;
2019-10-13 02:02:07 -04:00
private Texture _viewStorage ;
private List < Texture > _views ;
2019-12-29 18:26:37 -05:00
/// <summary>
/// Host texture.
/// </summary>
2019-10-13 02:02:07 -04:00
public ITexture HostTexture { get ; private set ; }
2019-12-29 18:26:37 -05:00
/// <summary>
/// Intrusive linked list node used on the auto deletion texture cache.
/// </summary>
2019-10-13 02:02:07 -04:00
public LinkedListNode < Texture > CacheNode { get ; set ; }
2020-02-06 16:49:26 -05:00
/// <summary>
/// Event to fire when texture data is disposed.
/// </summary>
public event Action < Texture > Disposed ;
2019-10-13 02:02:07 -04:00
2019-12-29 18:26:37 -05:00
/// <summary>
2021-01-17 13:44:34 -05:00
/// Physical memory ranges where the texture data is located.
2019-12-29 18:26:37 -05:00
/// </summary>
2021-01-17 13:44:34 -05:00
public MultiRange Range { get ; private set ; }
2019-10-13 02:02:07 -04:00
2021-03-02 17:30:54 -05:00
/// <summary>
/// Layer size in bytes.
/// </summary>
public int LayerSize => _sizeInfo . LayerSize ;
2019-12-29 18:26:37 -05:00
/// <summary>
/// Texture size in bytes.
/// </summary>
2019-10-13 02:02:07 -04:00
public ulong Size => ( ulong ) _sizeInfo . TotalSize ;
2021-03-02 17:30:54 -05:00
/// <summary>
/// Whether or not the texture is a view.
/// </summary>
public bool IsView => _viewStorage != this ;
2020-05-03 18:54:50 -04:00
2019-10-13 02:02:07 -04:00
private int _referenceCount ;
2021-03-06 09:43:55 -05:00
private List < TexturePoolOwner > _poolOwners ;
2019-10-13 02:02:07 -04:00
2019-12-29 18:26:37 -05:00
/// <summary>
/// Constructs a new instance of the cached GPU texture.
/// </summary>
/// <param name="context">GPU context that the texture belongs to</param>
2021-06-29 13:32:02 -04:00
/// <param name="physicalMemory">Physical memory where the texture is mapped</param>
2019-12-29 18:26:37 -05:00
/// <param name="info">Texture information</param>
/// <param name="sizeInfo">Size information of the texture</param>
2021-01-17 13:44:34 -05:00
/// <param name="range">Physical memory ranges where the texture data is located</param>
2019-12-29 18:26:37 -05:00
/// <param name="firstLayer">The first layer of the texture, or 0 if the texture has no parent</param>
/// <param name="firstLevel">The first mipmap level of the texture, or 0 if the texture has no parent</param>
2020-07-06 22:41:07 -04:00
/// <param name="scaleFactor">The floating point scale factor to initialize with</param>
/// <param name="scaleMode">The scale mode to initialize with</param>
2019-10-13 02:02:07 -04:00
private Texture (
2021-06-29 13:32:02 -04:00
GpuContext context ,
PhysicalMemory physicalMemory ,
TextureInfo info ,
SizeInfo sizeInfo ,
MultiRange range ,
int firstLayer ,
int firstLevel ,
float scaleFactor ,
2020-07-06 22:41:07 -04:00
TextureScaleMode scaleMode )
2019-10-13 02:02:07 -04:00
{
2021-06-29 13:32:02 -04:00
InitializeTexture ( context , physicalMemory , info , sizeInfo , range ) ;
2019-10-13 02:02:07 -04:00
2021-03-02 17:30:54 -05:00
FirstLayer = firstLayer ;
FirstLevel = firstLevel ;
2019-10-13 02:02:07 -04:00
2020-07-06 22:41:07 -04:00
ScaleFactor = scaleFactor ;
ScaleMode = scaleMode ;
2020-09-10 15:44:04 -04:00
InitializeData ( true ) ;
2019-10-13 02:02:07 -04:00
}
2019-12-29 18:26:37 -05:00
/// <summary>
/// Constructs a new instance of the cached GPU texture.
/// </summary>
/// <param name="context">GPU context that the texture belongs to</param>
2021-06-29 13:32:02 -04:00
/// <param name="physicalMemory">Physical memory where the texture is mapped</param>
2019-12-29 18:26:37 -05:00
/// <param name="info">Texture information</param>
/// <param name="sizeInfo">Size information of the texture</param>
2021-01-17 13:44:34 -05:00
/// <param name="range">Physical memory ranges where the texture data is located</param>
2020-07-06 22:41:07 -04:00
/// <param name="scaleMode">The scale mode to initialize with. If scaled, the texture's data is loaded immediately and scaled up</param>
2021-06-29 13:32:02 -04:00
public Texture (
GpuContext context ,
PhysicalMemory physicalMemory ,
TextureInfo info ,
SizeInfo sizeInfo ,
MultiRange range ,
TextureScaleMode scaleMode )
2019-10-13 02:02:07 -04:00
{
2020-07-06 22:41:07 -04:00
ScaleFactor = 1f ; // Texture is first loaded at scale 1x.
ScaleMode = scaleMode ;
2021-06-29 13:32:02 -04:00
InitializeTexture ( context , physicalMemory , info , sizeInfo , range ) ;
2019-10-13 02:02:07 -04:00
}
2019-12-29 18:26:37 -05:00
/// <summary>
/// Common texture initialization method.
/// This sets the context, info and sizeInfo fields.
/// Other fields are initialized with their default values.
/// </summary>
/// <param name="context">GPU context that the texture belongs to</param>
2021-06-29 13:32:02 -04:00
/// <param name="physicalMemory">Physical memory where the texture is mapped</param>
2019-12-29 18:26:37 -05:00
/// <param name="info">Texture information</param>
/// <param name="sizeInfo">Size information of the texture</param>
2021-01-17 13:44:34 -05:00
/// <param name="range">Physical memory ranges where the texture data is located</param>
2021-06-29 13:32:02 -04:00
private void InitializeTexture (
GpuContext context ,
PhysicalMemory physicalMemory ,
TextureInfo info ,
SizeInfo sizeInfo ,
MultiRange range )
2019-10-13 02:02:07 -04:00
{
2021-06-29 13:32:02 -04:00
_context = context ;
_physicalMemory = physicalMemory ;
2019-10-13 02:02:07 -04:00
_sizeInfo = sizeInfo ;
2021-06-29 13:32:02 -04:00
Range = range ;
2019-10-13 02:02:07 -04:00
SetInfo ( info ) ;
_viewStorage = this ;
_views = new List < Texture > ( ) ;
2021-03-06 09:43:55 -05:00
_poolOwners = new List < TexturePoolOwner > ( ) ;
2019-10-13 02:02:07 -04:00
}
2020-09-10 15:44:04 -04:00
/// <summary>
/// Initializes the data for a texture. Can optionally initialize the texture with or without data.
/// If the texture is a view, it will initialize memory tracking to be non-dirty.
/// </summary>
/// <param name="isView">True if the texture is a view, false otherwise</param>
/// <param name="withData">True if the texture is to be initialized with data</param>
public void InitializeData ( bool isView , bool withData = false )
{
2022-01-09 11:28:48 -05:00
withData |= Group != null && Group . FlushIncompatibleOverlapsIfNeeded ( ) ;
2020-09-10 15:44:04 -04:00
if ( withData )
{
Debug . Assert ( ! isView ) ;
2021-06-23 19:51:41 -04:00
TextureCreateInfo createInfo = TextureCache . GetCreateInfo ( Info , _context . Capabilities , ScaleFactor ) ;
2020-09-10 15:44:04 -04:00
HostTexture = _context . Renderer . CreateTexture ( createInfo , ScaleFactor ) ;
SynchronizeMemory ( ) ; // Load the data.
if ( ScaleMode == TextureScaleMode . Scaled )
{
SetScale ( GraphicsConfig . ResScale ) ; // Scale the data up.
}
}
else
{
_hasData = true ;
if ( ! isView )
{
2021-03-02 17:30:54 -05:00
// Don't update this texture the next time we synchronize.
2021-08-20 16:52:09 -04:00
CheckModified ( true ) ;
2021-03-02 17:30:54 -05:00
2020-09-10 15:44:04 -04:00
if ( ScaleMode == TextureScaleMode . Scaled )
{
// Don't need to start at 1x as there is no data to scale, just go straight to the target scale.
ScaleFactor = GraphicsConfig . ResScale ;
}
2021-06-23 19:51:41 -04:00
TextureCreateInfo createInfo = TextureCache . GetCreateInfo ( Info , _context . Capabilities , ScaleFactor ) ;
2020-09-10 15:44:04 -04:00
HostTexture = _context . Renderer . CreateTexture ( createInfo , ScaleFactor ) ;
}
}
}
2021-03-02 17:30:54 -05:00
/// <summary>
/// Initialize a new texture group with this texture as storage.
/// </summary>
/// <param name="hasLayerViews">True if the texture will have layer views</param>
/// <param name="hasMipViews">True if the texture will have mip views</param>
2022-01-09 11:28:48 -05:00
/// <param name="incompatibleOverlaps">Groups that overlap with this one but are incompatible</param>
public void InitializeGroup ( bool hasLayerViews , bool hasMipViews , List < TextureIncompatibleOverlap > incompatibleOverlaps )
2021-03-02 17:30:54 -05:00
{
2022-01-09 11:28:48 -05:00
Group = new TextureGroup ( _context , _physicalMemory , this , incompatibleOverlaps ) ;
2021-03-02 17:30:54 -05:00
Group . Initialize ( ref _sizeInfo , hasLayerViews , hasMipViews ) ;
}
2019-12-29 18:26:37 -05:00
/// <summary>
/// Create a texture view from this texture.
/// A texture view is defined as a child texture, covering a sub-range of its parent texture.
/// For example, the initial layer and mipmap level of the view can be defined, so the texture
/// will start at the given layer/level of the parent texture.
/// </summary>
/// <param name="info">Child texture information</param>
/// <param name="sizeInfo">Child texture size information</param>
2021-01-17 13:44:34 -05:00
/// <param name="range">Physical memory ranges where the texture data is located</param>
2019-12-29 18:26:37 -05:00
/// <param name="firstLayer">Start layer of the child texture on the parent texture</param>
/// <param name="firstLevel">Start mipmap level of the child texture on the parent texture</param>
/// <returns>The child texture</returns>
2021-01-17 13:44:34 -05:00
public Texture CreateView ( TextureInfo info , SizeInfo sizeInfo , MultiRange range , int firstLayer , int firstLevel )
2019-10-13 02:02:07 -04:00
{
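// Illustrative note (the numbers are hypothetical): firstLayer and firstLevel are relative to
// this texture, so the constructor call below adds them to this texture's own FirstLayer and
// FirstLevel. For instance, a view requested at layer 1 of a texture whose FirstLayer is 2
// is created with FirstLayer = 3, while the host view is still created with the relative layer 1.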
Texture texture = new Texture (
_context ,
2021-06-29 13:32:02 -04:00
_physicalMemory ,
2019-10-13 02:02:07 -04:00
info ,
sizeInfo ,
2021-01-17 13:44:34 -05:00
range ,
2021-03-02 17:30:54 -05:00
FirstLayer + firstLayer ,
FirstLevel + firstLevel ,
2020-07-06 22:41:07 -04:00
ScaleFactor ,
ScaleMode ) ;
2019-10-13 02:02:07 -04:00
2021-06-23 19:51:41 -04:00
TextureCreateInfo createInfo = TextureCache . GetCreateInfo ( info , _context . Capabilities , ScaleFactor ) ;
2019-10-13 02:02:07 -04:00
texture . HostTexture = HostTexture . CreateView ( createInfo , firstLayer , firstLevel ) ;
_viewStorage . AddView ( texture ) ;
return texture ;
}
2019-12-29 18:26:37 -05:00
/// <summary>
/// Adds a child texture to this texture.
/// </summary>
/// <param name="texture">The child texture</param>
2019-10-13 02:02:07 -04:00
private void AddView ( Texture texture )
{
2021-03-02 17:30:54 -05:00
IncrementReferenceCount ( ) ;
2020-10-16 16:18:35 -04:00
2019-10-13 02:02:07 -04:00
_views . Add ( texture ) ;
texture . _viewStorage = this ;
2021-03-02 17:30:54 -05:00
Group . UpdateViews ( _views ) ;
if ( texture . Group != null && texture . Group != Group )
{
if ( texture . Group . Storage == texture )
{
// This texture's group is no longer used.
Group . Inherit ( texture . Group ) ;
texture . Group . Dispose ( ) ;
}
}
texture . Group = Group ;
2019-10-13 02:02:07 -04:00
}
2019-12-29 18:26:37 -05:00
/// <summary>
/// Removes a child texture from this texture.
/// </summary>
/// <param name="texture">The child texture</param>
2019-10-13 02:02:07 -04:00
private void RemoveView ( Texture texture )
{
_views . Remove ( texture ) ;
2020-09-21 15:51:33 -04:00
texture . _viewStorage = texture ;
2019-10-30 19:45:01 -04:00
2021-03-02 17:30:54 -05:00
DecrementReferenceCount ( ) ;
}
/// <summary>
/// Create a copy dependency to a texture that is view compatible with this one.
/// When either texture is modified, the texture data will be copied to the other to keep them in sync.
/// This is essentially an emulated view, useful for handling multiple view parents or format incompatibility.
/// This also forces a copy on creation, to or from the given texture to get them in sync immediately.
/// </summary>
/// <param name="contained">The view compatible texture to create a dependency to</param>
/// <param name="layer">The base layer of the given texture relative to this one</param>
/// <param name="level">The base level of the given texture relative to this one</param>
/// <param name="copyTo">True if this texture is first copied to the given one, false for the opposite direction</param>
public void CreateCopyDependency ( Texture contained , int layer , int level , bool copyTo )
{
if ( contained . Group == Group )
{
return ;
}
Group . CreateCopyDependency ( contained , FirstLayer + layer , FirstLevel + level , copyTo ) ;
2019-10-13 02:02:07 -04:00
}
2019-12-29 18:26:37 -05:00
/// <summary>
/// Changes the texture size.
2020-01-01 10:39:09 -05:00
/// </summary>
/// <remarks>
2019-12-29 18:26:37 -05:00
/// This operation may also change the size of all mipmap levels, including from the parent
/// and other possible child textures, to ensure that all sizes are consistent.
2020-01-01 10:39:09 -05:00
/// </remarks>
2019-12-29 18:26:37 -05:00
/// <param name="width">The new texture width</param>
/// <param name="height">The new texture height</param>
/// <param name="depthOrLayers">The new texture depth (for 3D textures) or layers (for layered textures)</param>
2019-10-13 02:02:07 -04:00
public void ChangeSize ( int width , int height , int depthOrLayers )
{
2020-09-10 15:44:04 -04:00
int blockWidth = Info . FormatInfo . BlockWidth ;
int blockHeight = Info . FormatInfo . BlockHeight ;
2021-03-02 17:30:54 -05:00
width <<= FirstLevel ;
height <<= FirstLevel ;
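// Worked example (illustrative values, not taken from the source): if this texture is a view
// with FirstLevel = 2 and is resized to 100x60, the level 0 storage size becomes
// 100 << 2 = 400 by 60 << 2 = 240. In the loop below, a view with FirstLevel = 3 is then
// resized to Math.Max(1, 400 >> 3) = 50 by Math.Max(1, 240 >> 3) = 30, and this view
// itself gets 400 >> 2 = 100 by 240 >> 2 = 60 back.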
2019-10-13 02:02:07 -04:00
2020-12-03 14:34:27 -05:00
if ( Target == Target . Texture3D )
2019-10-13 02:02:07 -04:00
{
2021-03-02 17:30:54 -05:00
depthOrLayers <<= FirstLevel ;
2019-10-13 02:02:07 -04:00
}
else
{
2019-12-29 12:41:50 -05:00
depthOrLayers = _viewStorage . Info . DepthOrLayers ;
2019-10-13 02:02:07 -04:00
}
2020-09-10 15:44:04 -04:00
_viewStorage . RecreateStorageOrView ( width , height , blockWidth , blockHeight , depthOrLayers ) ;
2019-10-13 02:02:07 -04:00
foreach ( Texture view in _viewStorage . _views )
{
2021-03-02 17:30:54 -05:00
int viewWidth = Math . Max ( 1 , width >> view . FirstLevel ) ;
int viewHeight = Math . Max ( 1 , height >> view . FirstLevel ) ;
2019-10-13 02:02:07 -04:00
int viewDepthOrLayers ;
2019-12-29 12:41:50 -05:00
if ( view . Info . Target == Target . Texture3D )
2019-10-13 02:02:07 -04:00
{
2021-03-02 17:30:54 -05:00
viewDepthOrLayers = Math . Max ( 1 , depthOrLayers >> view . FirstLevel ) ;
2019-10-13 02:02:07 -04:00
}
else
{
2019-12-29 12:41:50 -05:00
viewDepthOrLayers = view . Info . DepthOrLayers ;
2019-10-13 02:02:07 -04:00
}
2020-09-10 15:44:04 -04:00
view . RecreateStorageOrView ( viewWidth , viewHeight , blockWidth , blockHeight , viewDepthOrLayers ) ;
2019-10-13 02:02:07 -04:00
}
}
2020-09-10 15:44:04 -04:00
/// <summary>
/// Recreates the texture storage (or view, in the case of child textures) of this texture.
/// This allows recreating the texture with a new size.
/// A copy is automatically performed from the old to the new texture.
/// </summary>
/// <param name="width">The new texture width</param>
/// <param name="height">The new texture height</param>
/// <param name="blockWidth">The block width related to the given width</param>
/// <param name="blockHeight">The block height related to the given height</param>
/// <param name="depthOrLayers">The new texture depth (for 3D textures) or layers (for layered textures)</param>
private void RecreateStorageOrView ( int width , int height , int blockWidth , int blockHeight , int depthOrLayers )
{
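// Worked example (illustrative values, not taken from the source): the call below appears to
// rescale the given size from the caller's block size to this texture's block size. With
// width = 400, blockWidth = 4 and a BlockWidth of 1 for this texture's format, the
// resulting width is DivRoundUp(400 * 1, 4) = 100. In the opposite direction, width = 100,
// blockWidth = 1 and a BlockWidth of 4 give DivRoundUp(100 * 4, 1) = 400.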
RecreateStorageOrView (
BitUtils . DivRoundUp ( width * Info . FormatInfo . BlockWidth , blockWidth ) ,
        BitUtils.DivRoundUp(height * Info.FormatInfo.BlockHeight, blockHeight),
        depthOrLayers);
}
/// <summary>
/// Recreates the texture storage (or view, in the case of child textures) of this texture.
/// This allows recreating the texture with a new size.
/// A copy is automatically performed from the old to the new texture.
/// </summary>
/// <param name="width">The new texture width</param>
/// <param name="height">The new texture height</param>
/// <param name="depthOrLayers">The new texture depth (for 3D textures) or layers (for layered textures)</param>
private void RecreateStorageOrView(int width, int height, int depthOrLayers)
{
    ChangedSize = true;

    SetInfo(new TextureInfo(
        Info.GpuAddress,
        width,
        height,
        depthOrLayers,
        Info.Levels,
        Info.SamplesInX,
        Info.SamplesInY,
        Info.Stride,
        Info.IsLinear,
        Info.GobBlocksInY,
        Info.GobBlocksInZ,
        Info.GobBlocksInTileX,
        Info.Target,
        Info.FormatInfo,
        Info.DepthStencilMode,
        Info.SwizzleR,
        Info.SwizzleG,
        Info.SwizzleB,
        Info.SwizzleA));

    TextureCreateInfo createInfo = TextureCache.GetCreateInfo(Info, _context.Capabilities, ScaleFactor);

    if (_viewStorage != this)
    {
        ReplaceStorage(_viewStorage.HostTexture.CreateView(createInfo, FirstLayer, FirstLevel));
    }
    else
    {
        ITexture newStorage = _context.Renderer.CreateTexture(createInfo, ScaleFactor);

        HostTexture.CopyTo(newStorage, 0, 0);

        ReplaceStorage(newStorage);
    }
}
/// <summary>
/// Blacklists this texture from being scaled. Resets its scale to 1 if needed.
/// </summary>
public void BlacklistScale()
{
    ScaleMode = TextureScaleMode.Blacklisted;
    SetScale(1f);
}

/// <summary>
/// Propagates the scale between this texture and another to ensure they have the same scale.
/// If one texture is blacklisted from scaling, the other will become blacklisted too.
/// </summary>
/// <param name="other">The other texture</param>
public void PropagateScale(Texture other)
{
    if (other.ScaleMode == TextureScaleMode.Blacklisted || ScaleMode == TextureScaleMode.Blacklisted)
    {
        BlacklistScale();
        other.BlacklistScale();
    }
    else
    {
        // Prefer the configured scale if present. If not, prefer the max.
        float targetScale = GraphicsConfig.ResScale;
        float sharedScale = (ScaleFactor == targetScale || other.ScaleFactor == targetScale) ? targetScale : Math.Max(ScaleFactor, other.ScaleFactor);

        SetScale(sharedScale);
        other.SetScale(sharedScale);
    }
}
/// <summary>
/// Copy the host texture to a scaled one. If a texture is not provided, create it with the given scale.
/// </summary>
/// <param name="scale">Scale factor</param>
/// <param name="storage">Texture to use instead of creating one</param>
/// <returns>A host texture containing a scaled version of this texture</returns>
private ITexture GetScaledHostTexture(float scale, ITexture storage = null)
{
    if (storage == null)
    {
        TextureCreateInfo createInfo = TextureCache.GetCreateInfo(Info, _context.Capabilities, scale);
        storage = _context.Renderer.CreateTexture(createInfo, scale);
    }

    HostTexture.CopyTo(storage, new Extents2D(0, 0, HostTexture.Width, HostTexture.Height), new Extents2D(0, 0, storage.Width, storage.Height), true);
    return storage;
}
/// <summary>
/// Sets the Scale Factor on this texture, and immediately recreates it at the correct size.
/// When a texture is resized, a scaled copy is performed from the old texture to the new one, to ensure no data is lost.
/// If scale is equivalent, this only propagates the blacklisted/scaled mode.
/// If called on a view, its storage is resized instead.
/// When resizing storage, all texture views are recreated.
/// </summary>
/// <param name="scale">The new scale factor for this texture</param>
public void SetScale(float scale)
{
    bool unscaled = ScaleMode == TextureScaleMode.Blacklisted || (ScaleMode == TextureScaleMode.Undesired && scale == 1);
    TextureScaleMode newScaleMode = unscaled ? ScaleMode : TextureScaleMode.Scaled;

    if (_viewStorage != this)
    {
        _viewStorage.ScaleMode = newScaleMode;
        _viewStorage.SetScale(scale);
        return;
    }

    if (ScaleFactor != scale)
    {
        Logger.Debug?.Print(LogClass.Gpu, $"Rescaling {Info.Width}x{Info.Height} {Info.FormatInfo.Format.ToString()} to ({ScaleFactor} to {scale}). ");

        ScaleFactor = scale;
        ITexture newStorage = GetScaledHostTexture(ScaleFactor);

        Logger.Debug?.Print(LogClass.Gpu, $" Copy performed: {HostTexture.Width}x{HostTexture.Height} to {newStorage.Width}x{newStorage.Height}");

        ReplaceStorage(newStorage);

        // All views must be recreated against the new storage.
        foreach (var view in _views)
        {
            Logger.Debug?.Print(LogClass.Gpu, $" Recreating view {Info.Width}x{Info.Height} {Info.FormatInfo.Format.ToString()}.");
            view.ScaleFactor = scale;

            TextureCreateInfo viewCreateInfo = TextureCache.GetCreateInfo(view.Info, _context.Capabilities, scale);
            ITexture newView = HostTexture.CreateView(viewCreateInfo, view.FirstLayer - FirstLayer, view.FirstLevel - FirstLevel);

            view.ReplaceStorage(newView);
            view.ScaleMode = newScaleMode;
        }
    }

    if (ScaleMode != newScaleMode)
    {
        ScaleMode = newScaleMode;

        foreach (var view in _views)
        {
            view.ScaleMode = newScaleMode;
        }
    }
}

/// <summary>
/// Checks if the memory for this texture was modified, and returns true if it was.
/// The modified flags are optionally consumed as a result.
/// </summary>
/// <param name="consume">True to consume the dirty flags and reprotect, false to leave them as is</param>
/// <returns>True if the texture was modified, false otherwise.</returns>
public bool CheckModified(bool consume)
{
    return Group.CheckDirty(this, consume);
}
/// <summary>
/// Synchronizes guest and host memory.
/// This will overwrite the texture data with the texture data on the guest memory, if a CPU
/// modification is detected.
/// Be aware that this can cause texture data written by the GPU to be lost, this is just a
/// one way copy (from CPU owned to GPU owned memory).
/// </summary>
public void SynchronizeMemory()
{
    if (Target == Target.TextureBuffer)
    {
        return;
    }

    if (!_dirty)
    {
        return;
    }

    _dirty = false;

    if (_hasData)
    {
        Group.SynchronizeMemory(this);
    }
    else
    {
        Group.CheckDirty(this, true);
        SynchronizeFull();
    }
}

/// <summary>
/// Signal that this texture is dirty, indicating that the texture group must be checked.
/// </summary>
public void SignalGroupDirty()
{
    _dirty = true;
}

/// <summary>
/// Signal that the modified state is dirty, indicating that the texture group should be notified when it changes.
/// </summary>
public void SignalModifiedDirty()
{
    _modifiedStale = true;
}
/// <summary>
/// Fully synchronizes guest and host memory.
/// This will replace the entire texture with the data present in guest memory.
/// </summary>
public void SynchronizeFull()
{
    if (_hasData)
    {
        BlacklistScale();
    }

    ReadOnlySpan<byte> data = _physicalMemory.GetSpan(Range);

    // If the host does not support ASTC compression, we need to do the decompression.
    // The decompression is slow, so we want to avoid it as much as possible.
    // This does a byte-by-byte check and skips the update if the data is equal in this case.
    // This improves the speed on applications that overwrite ASTC data without changing anything.
    if (Info.FormatInfo.Format.IsAstc() && !_context.Capabilities.SupportsAstcCompression)
    {
        if (_updateCount < ByteComparisonSwitchThreshold)
        {
            _updateCount++;
        }
        else
        {
            bool dataMatches = _currentData != null && data.SequenceEqual(_currentData);
            _currentData = data.ToArray();
            if (dataMatches)
            {
                return;
            }
        }
    }
            data = ConvertToHostCompatibleFormat(data);
            HostTexture.SetData(data);
            _hasData = true;
        }

        /// <summary>
        /// Uploads new texture data to the host GPU.
        /// </summary>
        /// <param name="data">New data</param>
        public void SetData(ReadOnlySpan<byte> data)
        {
            BlacklistScale();

            Group.CheckDirty(this, true);

            AlwaysFlushOnOverlap = true;

            HostTexture.SetData(data);
            _hasData = true;
        }

        /// <summary>
        /// Uploads new texture data to the host GPU for a specific layer/level.
        /// </summary>
        /// <param name="data">New data</param>
        /// <param name="layer">Target layer</param>
        /// <param name="level">Target level</param>
        public void SetData(ReadOnlySpan<byte> data, int layer, int level)
        {
            BlacklistScale();

            HostTexture.SetData(data, layer, level);

            _currentData = null;
            _hasData = true;
        }

        /// <summary>
        /// Converts texture data to a format and layout that is supported by the host GPU.
        /// </summary>
        /// <param name="data">Data to be converted</param>
        /// <param name="level">Mip level to convert</param>
        /// <param name="single">True to convert a single slice</param>
        /// <returns>Converted data</returns>
        public ReadOnlySpan<byte> ConvertToHostCompatibleFormat(ReadOnlySpan<byte> data, int level = 0, bool single = false)
        {
            int width = Info.Width;
            int height = Info.Height;

            int depth = _depth;
            int layers = single ? 1 : _layers;
            int levels = single ? 1 : (Info.Levels - level);

            width = Math.Max(width >> level, 1);
            height = Math.Max(height >> level, 1);
            depth = Math.Max(depth >> level, 1);

            if (Info.IsLinear)
            {
                data = LayoutConverter.ConvertLinearStridedToLinear(
                    width,
                    height,
                    Info.FormatInfo.BlockWidth,
                    Info.FormatInfo.BlockHeight,
                    Info.Stride,
                    Info.Stride,
                    Info.FormatInfo.BytesPerPixel,
                    data);
            }
            else
            {
                data = LayoutConverter.ConvertBlockLinearToLinear(
                    width,
                    height,
                    depth,
                    single ? 1 : depth,
                    levels,
                    layers,
                    Info.FormatInfo.BlockWidth,
                    Info.FormatInfo.BlockHeight,
                    Info.FormatInfo.BytesPerPixel,
                    Info.GobBlocksInY,
                    Info.GobBlocksInZ,
                    Info.GobBlocksInTileX,
                    _sizeInfo,
                    data);
            }

            // Handle compressed cases not supported by the host:
            // - ASTC is usually not supported on desktop cards.
            // - BC4/BC5 is not supported on 3D textures.
            if (!_context.Capabilities.SupportsAstcCompression && Format.IsAstc())
            {
                if (!AstcDecoder.TryDecodeToRgba8P(
                    data.ToArray(),
                    Info.FormatInfo.BlockWidth,
                    Info.FormatInfo.BlockHeight,
                    width,
                    height,
                    depth,
                    levels,
                    layers,
                    out Span<byte> decoded))
                {
                    string texInfo = $"{Info.Target} {Info.FormatInfo.Format} {Info.Width}x{Info.Height}x{Info.DepthOrLayers} levels {Info.Levels}";

                    Logger.Debug?.Print(LogClass.Gpu, $"Invalid ASTC texture at 0x{Info.GpuAddress:X} ({texInfo}).");
                }

                data = decoded;
            }
            else if (!_context.Capabilities.SupportsR4G4Format && Format == Format.R4G4Unorm)
            {
                data = PixelConverter.ConvertR4G4ToR4G4B4A4(data);
            }
            else if (!_context.Capabilities.Supports3DTextureCompression && Target == Target.Texture3D)
            {
                switch (Format)
                {
                    case Format.Bc1RgbaSrgb:
                    case Format.Bc1RgbaUnorm:
                        data = BCnDecoder.DecodeBC1(data, width, height, depth, levels, layers);
                        break;
                    case Format.Bc2Srgb:
                    case Format.Bc2Unorm:
                        data = BCnDecoder.DecodeBC2(data, width, height, depth, levels, layers);
                        break;
                    case Format.Bc3Srgb:
                    case Format.Bc3Unorm:
                        data = BCnDecoder.DecodeBC3(data, width, height, depth, levels, layers);
                        break;
                    case Format.Bc4Snorm:
                    case Format.Bc4Unorm:
                        data = BCnDecoder.DecodeBC4(data, width, height, depth, levels, layers, Format == Format.Bc4Snorm);
                        break;
                    case Format.Bc5Snorm:
                    case Format.Bc5Unorm:
                        data = BCnDecoder.DecodeBC5(data, width, height, depth, levels, layers, Format == Format.Bc5Snorm);
                        break;
                }
            }

            return data;
        }

        /// <summary>
        /// Converts texture data from a format and layout that is supported by the host GPU, back into the intended format on the guest GPU.
        /// </summary>
        /// <param name="output">Optional output span to convert into</param>
        /// <param name="data">Data to be converted</param>
        /// <param name="level">Mip level to convert</param>
        /// <param name="single">True to convert a single slice</param>
        /// <returns>Converted data</returns>
        public ReadOnlySpan<byte> ConvertFromHostCompatibleFormat(Span<byte> output, ReadOnlySpan<byte> data, int level = 0, bool single = false)
        {
            if (Target != Target.TextureBuffer)
            {
                int width = Info.Width;
                int height = Info.Height;

                int depth = _depth;
                int layers = single ? 1 : _layers;
                int levels = single ? 1 : (Info.Levels - level);

                width = Math.Max(width >> level, 1);
                height = Math.Max(height >> level, 1);
                depth = Math.Max(depth >> level, 1);

                if (Info.IsLinear)
                {
                    data = LayoutConverter.ConvertLinearToLinearStrided(
                        output,
                        Info.Width,
                        Info.Height,
                        Info.FormatInfo.BlockWidth,
                        Info.FormatInfo.BlockHeight,
                        Info.Stride,
                        Info.FormatInfo.BytesPerPixel,
                        data);
                }
                else
                {
                    data = LayoutConverter.ConvertLinearToBlockLinear(
                        output,
                        width,
                        height,
                        depth,
                        single ? 1 : depth,
                        levels,
                        layers,
                        Info.FormatInfo.BlockWidth,
                        Info.FormatInfo.BlockHeight,
                        Info.FormatInfo.BytesPerPixel,
                        Info.GobBlocksInY,
                        Info.GobBlocksInZ,
                        Info.GobBlocksInTileX,
                        _sizeInfo,
                        data);
                }
            }

            return data;
        }

        /// <summary>
        /// Flushes the texture data.
        /// This causes the texture data to be written back to guest memory.
        /// If the texture was written by the GPU, this includes all modifications made by the GPU
        /// up to this point.
        /// Be aware that this is an expensive operation; avoid calling it unless strictly needed.
        /// This may cause data corruption if the memory is already being used for something else on the CPU side.
        /// </summary>
        /// <param name="tracked">Whether or not the flush triggers write tracking. If it doesn't, the texture will not be blacklisted for scaling either.</param>
        /// <returns>True if data was flushed, false otherwise</returns>
        public bool FlushModified(bool tracked = true)
        {
            return TextureCompatibility.CanTextureFlush(Info, _context.Capabilities) && Group.FlushModified(this, tracked);
        }

        /// <summary>
        /// Flushes the texture data.
        /// This causes the texture data to be written back to guest memory.
        /// If the texture was written by the GPU, this includes all modifications made by the GPU
        /// up to this point.
        /// Be aware that this is an expensive operation; avoid calling it unless strictly needed.
        /// This may cause data corruption if the memory is already being used for something else on the CPU side.
        /// </summary>
        /// <param name="tracked">Whether or not the flush triggers write tracking. If it doesn't, the texture will not be blacklisted for scaling either.</param>
        public void Flush(bool tracked)
        {
            if (TextureCompatibility.CanTextureFlush(Info, _context.Capabilities))
{
FlushTextureDataToGuest(tracked);
}
}
/// <summary>
/// Gets a host texture to use for flushing the texture, at 1x resolution.
/// If the HostTexture is already at 1x resolution, it is returned directly.
/// </summary>
/// <returns>The host texture to flush</returns>
public ITexture GetFlushTexture()
{
ITexture texture = HostTexture;
if (ScaleFactor != 1f)
{
// If needed, create a texture to flush back to host at 1x scale.
texture = _flushHostTexture = GetScaledHostTexture(1f, _flushHostTexture);
}
return texture;
}
/// <summary>
/// Gets data from the host GPU, and flushes it all to guest memory.
/// </summary>
/// <remarks>
/// This method should be used to retrieve data that was modified by the host GPU.
/// This is not cheap; avoid calling it unless strictly needed.
/// When possible, the data is written directly into guest memory, rather than copied.
/// </remarks>
/// <param name="tracked">True if writing the texture data is tracked, false otherwise</param>
/// <param name="texture">The specific host texture to flush. Defaults to this texture</param>
public void FlushTextureDataToGuest(bool tracked, ITexture texture = null)
{
using WritableRegion region = _physicalMemory.GetWritableRegion(Range, tracked);
GetTextureDataFromGpu(region.Memory.Span, tracked, texture);
}
/// <summary>
/// Gets data from the host GPU.
/// </summary>
/// <remarks>
/// This method should be used to retrieve data that was modified by the host GPU.
/// This is not cheap; avoid calling it unless strictly needed.
/// </remarks>
/// <param name="output">An output span to place the texture data into. If empty, one is generated</param>
/// <param name="blacklist">True if the texture should be blacklisted, false otherwise</param>
/// <param name="texture">The specific host texture to flush. Defaults to this texture</param>
/// <returns>The span containing the texture data</returns>
private ReadOnlySpan<byte> GetTextureDataFromGpu(Span<byte> output, bool blacklist, ITexture texture = null)
{
ReadOnlySpan<byte> data;
if ( texture != null )
2020-09-10 15:44:04 -04:00
{
2020-10-16 16:18:35 -04:00
data = texture . GetData ( ) ;
2020-09-10 15:44:04 -04:00
}
else
{
2020-10-16 16:18:35 -04:00
if ( blacklist )
{
BlacklistScale ( ) ;
data = HostTexture . GetData ( ) ;
}
else if ( ScaleFactor != 1f )
{
float scale = ScaleFactor ;
SetScale ( 1f ) ;
data = HostTexture . GetData ( ) ;
SetScale ( scale ) ;
}
else
{
data = HostTexture . GetData ( ) ;
}
2020-09-10 15:44:04 -04:00
}
2019-12-05 15:34:47 -05:00
2022-01-09 11:28:48 -05:00
data = ConvertFromHostCompatibleFormat ( output , data ) ;
return data ;
}
/// <summary>
/// Gets data from the host GPU for a single slice.
/// </summary>
/// <remarks>
/// This method should be used to retrieve data that was modified by the host GPU.
/// This is not cheap; avoid calling it unless strictly needed.
/// </remarks>
/// <param name="output">An output span to place the texture data into. If empty, one is generated</param>
/// <param name="layer">The layer of the texture to flush</param>
/// <param name="level">The level of the texture to flush</param>
/// <param name="blacklist">True if the texture should be blacklisted, false otherwise</param>
/// <param name="texture">The specific host texture to flush. Defaults to this texture</param>
/// <returns>The span containing the texture data</returns>
public ReadOnlySpan < byte > GetTextureDataSliceFromGpu ( Span < byte > output , int layer , int level , bool blacklist , ITexture texture = null )
{
ReadOnlySpan < byte > data ;
if ( texture != null )
2019-12-05 15:34:47 -05:00
{
2022-01-09 11:28:48 -05:00
data = texture . GetData ( layer , level ) ;
}
else
{
if ( blacklist )
2020-11-18 16:17:40 -05:00
{
2022-01-09 11:28:48 -05:00
BlacklistScale ( ) ;
data = HostTexture . GetData ( layer , level ) ;
}
else if ( ScaleFactor != 1f )
{
float scale = ScaleFactor ;
SetScale ( 1f ) ;
data = HostTexture . GetData ( layer , level ) ;
SetScale ( scale ) ;
2020-11-18 16:17:40 -05:00
}
else
{
2022-01-09 11:28:48 -05:00
data = HostTexture . GetData ( layer , level ) ;
2020-11-18 16:17:40 -05:00
}
2019-12-05 15:34:47 -05:00
}
2019-10-13 02:02:07 -04:00
2022-01-09 11:28:48 -05:00
data = ConvertFromHostCompatibleFormat ( output , data , level , true ) ;
2020-03-19 23:17:11 -04:00
return data ;
2019-10-13 02:02:07 -04:00
}
2019-12-29 18:26:37 -05:00
/// <summary>
2020-08-31 20:06:27 -04:00
/// This performs a strict comparison, used to check if this texture is equal to the one supplied.
2019-12-29 18:26:37 -05:00
/// </summary>
2020-08-31 20:06:27 -04:00
/// <param name="info">Texture information to compare against</param>
2019-12-29 18:26:37 -05:00
/// <param name="flags">Comparison flags</param>
2020-11-27 13:46:23 -05:00
/// <returns>A value indicating how well this texture matches the given info</returns>
public TextureMatchQuality IsExactMatch ( TextureInfo info , TextureSearchFlags flags )
2019-10-13 02:02:07 -04:00
{
2020-11-27 13:46:23 -05:00
TextureMatchQuality matchQuality = TextureCompatibility . FormatMatches ( Info , info , ( flags & TextureSearchFlags . ForSampler ) != 0 , ( flags & TextureSearchFlags . ForCopy ) != 0 ) ;
if ( matchQuality == TextureMatchQuality . NoMatch )
2019-10-13 02:02:07 -04:00
{
2020-11-27 13:46:23 -05:00
return matchQuality ;
2019-10-13 02:02:07 -04:00
}
2020-08-31 20:06:27 -04:00
if ( ! TextureCompatibility . LayoutMatches ( Info , info ) )
2019-10-13 02:02:07 -04:00
{
2020-11-27 13:46:23 -05:00
return TextureMatchQuality . NoMatch ;
2019-10-13 02:02:07 -04:00
}
2021-05-24 03:35:26 -04:00
if ( ! TextureCompatibility . SizeMatches ( Info , info , ( flags & TextureSearchFlags . Strict ) == 0 , FirstLevel ) )
2019-10-13 02:02:07 -04:00
{
2020-11-27 13:46:23 -05:00
return TextureMatchQuality . NoMatch ;
2019-10-13 02:02:07 -04:00
}
2020-07-13 07:41:30 -04:00
if ( ( flags & TextureSearchFlags . ForSampler ) != 0 || ( flags & TextureSearchFlags . Strict ) != 0 )
2019-10-13 02:02:07 -04:00
{
2020-08-31 20:06:27 -04:00
if ( ! TextureCompatibility . SamplerParamsMatches ( Info , info ) )
2019-10-13 02:02:07 -04:00
{
2020-11-27 13:46:23 -05:00
return TextureMatchQuality . NoMatch ;
2019-10-13 02:02:07 -04:00
}
}
2020-07-13 07:41:30 -04:00
if ( ( flags & TextureSearchFlags . ForCopy ) != 0 )
2019-10-13 02:02:07 -04:00
{
2019-12-29 12:41:50 -05:00
bool msTargetCompatible = Info . Target == Target . Texture2DMultisample && info . Target == Target . Texture2D ;
2019-10-13 02:02:07 -04:00
2020-08-31 20:06:27 -04:00
if ( ! msTargetCompatible && ! TextureCompatibility . TargetAndSamplesCompatible ( Info , info ) )
2019-10-13 02:02:07 -04:00
{
2020-11-27 13:46:23 -05:00
return TextureMatchQuality . NoMatch ;
2019-10-13 02:02:07 -04:00
}
}
2020-08-31 20:06:27 -04:00
else if ( ! TextureCompatibility . TargetAndSamplesCompatible ( Info , info ) )
2019-10-13 02:02:07 -04:00
{
2020-11-27 13:46:23 -05:00
return TextureMatchQuality . NoMatch ;
2019-10-13 02:02:07 -04:00
}
2021-01-17 13:44:34 -05:00
return Info . Levels == info . Levels ? matchQuality : TextureMatchQuality . NoMatch ;
2019-10-13 02:02:07 -04:00
}
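// Usage sketch (illustrative only, not part of the original source): a hypothetical caller that
// already holds a Texture `texture` and a TextureInfo `info` might inspect the result as follows.
// Only the NoMatch value is relied on here, since it is the only TextureMatchQuality member
// that appears in this section.
//
// TextureMatchQuality quality = texture . IsExactMatch ( info , TextureSearchFlags . ForSampler ) ;
// if ( quality != TextureMatchQuality . NoMatch )
// {
//     // Format, layout, size, sampler parameters, target and level count all matched,
//     // so this texture is a candidate for reuse.
// }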
2019-12-29 18:26:37 -05:00
/// <summary>
/// Check if it's possible to create a view, with the given parameters, from this texture.
/// </summary>
/// <param name="info">Texture view information</param>
2021-01-17 13:44:34 -05:00
/// <param name="range">Texture view physical memory ranges</param>
2022-01-09 11:28:48 -05:00
/// <param name="layerSize">Layer size on the given texture</param>
/// <param name="caps">Host GPU capabilities</param>
2019-12-29 18:26:37 -05:00
/// <param name="firstLayer">Texture view initial layer on this texture</param>
/// <param name="firstLevel">Texture view first mipmap level on this texture</param>
2020-09-10 15:44:04 -04:00
/// <returns>The level of compatibility of a view with the given parameters created from this texture</returns>
2022-01-09 11:28:48 -05:00
public TextureViewCompatibility IsViewCompatible ( TextureInfo info , MultiRange range , int layerSize , Capabilities caps , out int firstLayer , out int firstLevel )
2019-10-13 02:02:07 -04:00
{
2022-01-09 11:28:48 -05:00
TextureViewCompatibility result = TextureViewCompatibility . Full ;
2021-01-17 13:44:34 -05:00
2022-01-09 11:28:48 -05:00
result = TextureCompatibility . PropagateViewCompatibility ( result , TextureCompatibility . ViewFormatCompatible ( Info , info , caps ) ) ;
if ( result != TextureViewCompatibility . Incompatible )
2019-10-13 02:02:07 -04:00
{
2022-01-09 11:28:48 -05:00
result = TextureCompatibility . PropagateViewCompatibility ( result , TextureCompatibility . ViewTargetCompatible ( Info , info ) ) ;
2019-10-13 02:02:07 -04:00
2022-01-09 11:28:48 -05:00
if ( result == TextureViewCompatibility . Full && Info . FormatInfo . Format != info . FormatInfo . Format && ! _context . Capabilities . SupportsMismatchingViewFormat )
{
// AMD and Intel have a bug where the view format is always ignored;
// they use the parent format instead.
// Create a copy dependency to avoid this issue.
result = TextureViewCompatibility . CopyOnly ;
}
if ( Info . SamplesInX != info . SamplesInX || Info . SamplesInY != info . SamplesInY )
{
result = TextureViewCompatibility . Incompatible ;
}
2019-10-13 02:02:07 -04:00
}
2022-01-09 11:28:48 -05:00
firstLayer = 0 ;
firstLevel = 0 ;
if ( result == TextureViewCompatibility . Incompatible )
2019-10-13 02:02:07 -04:00
{
2020-09-10 15:44:04 -04:00
return TextureViewCompatibility . Incompatible ;
2019-10-13 02:02:07 -04:00
}
2022-01-09 11:28:48 -05:00
int offset = Range . FindOffset ( range ) ;
if ( offset < 0 || ! _sizeInfo . FindView ( offset , out firstLayer , out firstLevel ) )
{
return TextureViewCompatibility . LayoutIncompatible ;
}
2020-08-31 20:06:27 -04:00
if ( ! TextureCompatibility . ViewLayoutCompatible ( Info , info , firstLevel ) )
2019-10-13 02:02:07 -04:00
{
2022-01-09 11:28:48 -05:00
return TextureViewCompatibility . LayoutIncompatible ;
2019-10-13 02:02:07 -04:00
}
2021-03-02 17:30:54 -05:00
if ( info . GetSlices ( ) > 1 && LayerSize != layerSize )
2019-10-13 02:02:07 -04:00
{
2022-01-09 11:28:48 -05:00
return TextureViewCompatibility . LayoutIncompatible ;
2019-10-13 02:02:07 -04:00
}
2020-09-10 15:44:04 -04:00
result = TextureCompatibility . PropagateViewCompatibility ( result , TextureCompatibility . ViewSizeMatches ( Info , info , firstLevel ) ) ;
2021-03-02 17:30:54 -05:00
result = TextureCompatibility . PropagateViewCompatibility ( result , TextureCompatibility . ViewSubImagesInBounds ( Info , info , firstLayer , firstLevel ) ) ;
2020-09-10 15:44:04 -04:00
2022-01-09 11:28:48 -05:00
return result ;
2020-09-10 15:44:04 -04:00
}
2019-10-13 02:02:07 -04:00
2019-12-29 18:26:37 -05:00
/// <summary>
/// Gets a texture of the specified target type from this texture.
/// This can be used to get an array texture from a non-array texture and vice-versa.
/// If this texture's target and the requested target are equal, then this texture's host texture is returned directly.
/// </summary>
/// <param name="target">The desired target type</param>
/// <returns>A view of this texture with the requested target, or null if the target is invalid for this texture</returns>
2019-10-13 02:02:07 -04:00
public ITexture GetTargetTexture ( Target target )
{
2020-12-03 14:34:27 -05:00
if ( target == Target )
2019-10-13 02:02:07 -04:00
{
return HostTexture ;
}
if ( _arrayViewTexture == null && IsSameDimensionsTarget ( target ) )
{
TextureCreateInfo createInfo = new TextureCreateInfo (
2019-12-29 12:41:50 -05:00
Info . Width ,
Info . Height ,
2019-10-13 02:02:07 -04:00
target == Target . CubemapArray ? 6 : 1 ,
2019-12-29 12:41:50 -05:00
Info . Levels ,
Info . Samples ,
Info . FormatInfo . BlockWidth ,
Info . FormatInfo . BlockHeight ,
Info . FormatInfo . BytesPerPixel ,
Info . FormatInfo . Format ,
Info . DepthStencilMode ,
2019-10-13 02:02:07 -04:00
target ,
2019-12-29 12:41:50 -05:00
Info . SwizzleR ,
Info . SwizzleG ,
Info . SwizzleB ,
Info . SwizzleA ) ;
2019-10-13 02:02:07 -04:00
ITexture viewTexture = HostTexture . CreateView ( createInfo , 0 , 0 ) ;
_arrayViewTexture = viewTexture ;
_arrayViewTarget = target ;
return viewTexture ;
}
else if ( _arrayViewTarget == target )
{
return _arrayViewTexture ;
}
return null ;
}
2021-12-08 16:09:36 -05:00
/// <summary>
/// Determine if this texture can have anisotropic filtering forced.
/// Filtered textures that we might want to force anisotropy on should have a lot of mip levels.
/// </summary>
/// <returns>True if anisotropic filtering can be forced, false otherwise</returns>
private bool CanTextureForceAnisotropy ( )
{
if ( ! ( Target == Target . Texture2D || Target == Target . Texture2DArray ) )
{
return false ;
}
int maxSize = Math . Max ( Info . Width , Info . Height ) ;
int maxLevels = BitOperations . Log2 ( ( uint ) maxSize ) + 1 ;
return Info . Levels >= Math . Min ( MinLevelsForForceAnisotropy , maxLevels ) ;
}
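// Worked example (illustrative only; the value of MinLevelsForForceAnisotropy is defined elsewhere
// and is not shown in this section): for a 1024x512 texture, maxSize = 1024 and
// maxLevels = Log2 ( 1024 ) + 1 = 10 + 1 = 11, which is the length of the full mip chain
// 1024, 512, 256, 128, 64, 32, 16, 8, 4, 2, 1 along the larger axis.
// Because the threshold is Math . Min ( MinLevelsForForceAnisotropy , maxLevels ), a small texture
// whose complete mip chain is shorter than MinLevelsForForceAnisotropy can still qualify,
// provided it has every level of that chain.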
2019-12-29 18:26:37 -05:00
/// <summary>
/// Check if this texture and the specified target have the same number of dimensions.
/// For the purposes of this comparison, 2D and 2D Multisample textures are not considered to have
/// the same number of dimensions. Same for Cubemap and 3D textures.
/// </summary>
/// <param name="target">The target to compare with</param>
/// <returns>True if both targets have the same number of dimensions, false otherwise</returns>
2019-10-13 02:02:07 -04:00
private bool IsSameDimensionsTarget ( Target target )
{
2019-12-29 12:41:50 -05:00
switch ( Info . Target )
2019-10-13 02:02:07 -04:00
{
case Target . Texture1D :
case Target . Texture1DArray :
return target == Target . Texture1D ||
target == Target . Texture1DArray ;
case Target . Texture2D :
case Target . Texture2DArray :
return target == Target . Texture2D ||
target == Target . Texture2DArray ;
case Target . Cubemap :
case Target . CubemapArray :
return target == Target . Cubemap ||
target == Target . CubemapArray ;
case Target . Texture2DMultisample :
case Target . Texture2DMultisampleArray :
return target == Target . Texture2DMultisample ||
target == Target . Texture2DMultisampleArray ;
case Target . Texture3D :
return target == Target . Texture3D ;
}
return false ;
}
2019-12-29 18:26:37 -05:00
/// <summary>
/// Replaces view texture information.
/// This should only be used for child textures with a parent.
/// </summary>
/// <param name="parent">The parent texture</param>
/// <param name="info">The new view texture information</param>
/// <param name="hostTexture">The new host texture</param>
2020-07-06 22:41:07 -04:00
/// <param name="firstLayer">The first layer of the view</param>
/// <param name="firstLevel">The first level of the view</param>
public void ReplaceView ( Texture parent , TextureInfo info , ITexture hostTexture , int firstLayer , int firstLevel )
2019-10-13 02:02:07 -04:00
{
2021-03-02 17:30:54 -05:00
IncrementReferenceCount ( ) ;
2020-10-16 16:18:35 -04:00
parent . _viewStorage . SynchronizeMemory ( ) ;
2021-03-02 17:30:54 -05:00
// If this texture has views, they must be given to the new parent.
if ( _views . Count > 0 )
{
Texture [ ] viewCopy = _views . ToArray ( ) ;
foreach ( Texture view in viewCopy )
{
2021-06-23 19:51:41 -04:00
TextureCreateInfo createInfo = TextureCache . GetCreateInfo ( view . Info , _context . Capabilities , ScaleFactor ) ;
2021-03-02 17:30:54 -05:00
2021-04-02 10:33:39 -04:00
ITexture newView = parent . HostTexture . CreateView ( createInfo , view . FirstLayer + firstLayer , view . FirstLevel + firstLevel ) ;
2021-03-02 17:30:54 -05:00
view . ReplaceView ( parent , view . Info , newView , view . FirstLayer + firstLayer , view . FirstLevel + firstLevel ) ;
}
}
2019-10-13 02:02:07 -04:00
ReplaceStorage ( hostTexture ) ;
2021-03-02 17:30:54 -05:00
if ( _viewStorage != this )
{
_viewStorage . RemoveView ( this ) ;
}
FirstLayer = parent . FirstLayer + firstLayer ;
FirstLevel = parent . FirstLevel + firstLevel ;
2019-10-13 02:02:07 -04:00
parent . _viewStorage . AddView ( this ) ;
SetInfo ( info ) ;
2021-03-02 17:30:54 -05:00
DecrementReferenceCount ( ) ;
2019-10-13 02:02:07 -04:00
}
2019-12-29 18:26:37 -05:00
/// <summary>
/// Sets the internal texture information structure.
/// </summary>
/// <param name="info">The new texture information</param>
2019-10-13 02:02:07 -04:00
private void SetInfo ( TextureInfo info )
{
2019-12-29 12:41:50 -05:00
Info = info ;
2020-12-03 14:34:27 -05:00
Target = info . Target ;
2022-01-11 14:15:17 -05:00
Width = info . Width ;
Height = info . Height ;
2021-12-08 16:09:36 -05:00
CanForceAnisotropy = CanTextureForceAnisotropy ( ) ;
2019-10-13 02:02:07 -04:00
_depth = info . GetDepth ( ) ;
_layers = info . GetLayers ( ) ;
}
2020-02-06 16:49:26 -05:00
/// <summary>
/// Signals that the texture has been modified.
/// </summary>
public void SignalModified ( )
{
2022-01-09 11:28:48 -05:00
if ( _modifiedStale || Group . HasCopyDependencies )
2020-09-10 15:44:04 -04:00
{
2022-01-09 11:28:48 -05:00
_modifiedStale = false ;
Group . SignalModified ( this ) ;
2020-09-10 15:44:04 -04:00
}
Lift textures in the AutoDeleteCache for all modifications. (#2615)
* Lift textures in the AutoDeleteCache for all modifications.
Before, this would only apply to render targets and texture blit. Now it applies to image stores, the fast dma copy path and any other type of modification.
Image store always at least has one reference in the texture pool, so the function of the AutoDeleteCache keeping textures _alive_ is not useful, but a very important function for a while has been its use to flush textures in order of modification when they are dereferenced, so that their data is not lost.
Before, textures populated using image stores were being dereferenced and reloaded as garbage. Now, when these textures are dereferenced, their data will be put back into memory, and everything stays intact.
Fixes lighting breaking when switching levels in THPS1+2, and potentially some more UE4 games. I've tested a bunch more games for regressions and performance impact, but they all seem fine.
* Lift copy srcTexture so that it doesn't remain referenceless
* Perform lift before reference count change on unbind.
It's important to lift on unbind as that is the moment the texture was truly last modified, but definitely not after releasing every single reference.
2021-09-11 15:52:54 -04:00
_physicalMemory . TextureCache . Lift ( this ) ;
2021-03-02 17:30:54 -05:00
}
/// <summary>
/// Signals that a texture has been bound, or has been unbound.
/// During this time, lazy copies will not clear the dirty flag.
/// </summary>
/// <param name="bound">True if the texture has been bound, false if it has been unbound</param>
public void SignalModifying ( bool bound )
{
2022-01-09 11:28:48 -05:00
if ( _modifiedStale || Group . HasCopyDependencies )
2021-03-02 17:30:54 -05:00
{
2022-01-09 11:28:48 -05:00
_modifiedStale = false ;
Group . SignalModifying ( this , bound ) ;
2021-03-02 17:30:54 -05:00
}
2021-04-02 10:33:39 -04:00
2021-09-11 15:52:54 -04:00
_physicalMemory . TextureCache . Lift ( this ) ;
2021-04-02 10:33:39 -04:00
if ( bound )
{
IncrementReferenceCount ( ) ;
}
else
{
DecrementReferenceCount ( ) ;
}
2020-02-06 16:49:26 -05:00
}
2019-12-29 18:26:37 -05:00
/// <summary>
/// Replaces the host texture, while disposing of the old one if needed.
/// </summary>
/// <param name="hostTexture">The new host texture</param>
2019-10-13 02:02:07 -04:00
private void ReplaceStorage ( ITexture hostTexture )
{
DisposeTextures ( ) ;
HostTexture = hostTexture ;
}
2021-08-29 15:22:13 -04:00
/// <summary>
/// Determine if any of this texture's data overlaps with another.
/// </summary>
/// <param name="texture">The texture to check against</param>
2022-01-11 03:37:40 -05:00
/// <param name="compatibility">The view compatibility of the two textures</param>
2021-08-29 15:22:13 -04:00
/// <returns>True if any slice of the textures overlap, false otherwise</returns>
2022-01-11 03:37:40 -05:00
public bool DataOverlaps ( Texture texture , TextureViewCompatibility compatibility )
2021-08-29 15:22:13 -04:00
{
2022-01-11 03:37:40 -05:00
if ( compatibility == TextureViewCompatibility . LayoutIncompatible && Info . GobBlocksInZ > 1 && Info . GobBlocksInZ == texture . Info . GobBlocksInZ )
{
// Allow overlapping slices of layout compatible 3D textures with matching GobBlocksInZ, as they are interleaved.
return false ;
}
2021-08-29 15:22:13 -04:00
if ( texture . _sizeInfo . AllOffsets . Length == 1 && _sizeInfo . AllOffsets . Length == 1 )
{
return Range . OverlapsWith ( texture . Range ) ;
}
MultiRange otherRange = texture . Range ;
IEnumerable < MultiRange > regions = _sizeInfo . AllRegions ( ) . Select ( ( region ) => Range . GetSlice ( ( ulong ) region . Offset , ( ulong ) region . Size ) ) ;
IEnumerable < MultiRange > otherRegions = texture . _sizeInfo . AllRegions ( ) . Select ( ( region ) => otherRange . GetSlice ( ( ulong ) region . Offset , ( ulong ) region . Size ) ) ;
foreach ( MultiRange region in regions )
{
foreach ( MultiRange otherRegion in otherRegions )
{
if ( region . OverlapsWith ( otherRegion ) )
{
return true ;
}
}
}
return false ;
}

        /// <summary>
        /// Increments the texture reference count.
        /// </summary>
        public void IncrementReferenceCount()
        {
            _referenceCount++;
        }

        /// <summary>
        /// Increments the reference count and records the given texture pool and ID as a pool owner.
        /// </summary>
        /// <param name="pool">The texture pool this texture has been added to</param>
        /// <param name="id">The ID of the reference to this texture in the pool</param>
        public void IncrementReferenceCount(TexturePool pool, int id)
        {
            lock (_poolOwners)
            {
                _poolOwners.Add(new TexturePoolOwner { Pool = pool, ID = id });
            }

            _referenceCount++;
        }

        /// <summary>
        /// Decrements the texture reference count.
        /// When the reference count hits zero, the texture may be deleted and can't be used anymore.
        /// </summary>
        /// <returns>True if the texture is now referenceless, false otherwise</returns>
        public bool DecrementReferenceCount()
        {
            int newRefCount = --_referenceCount;

            if (newRefCount == 0)
            {
                if (_viewStorage != this)
                {
                    _viewStorage.RemoveView(this);
                }

                _physicalMemory.TextureCache.RemoveTextureFromCache(this);
            }

            Debug.Assert(newRefCount >= 0);

            DeleteIfNotUsed();

            return newRefCount <= 0;
        }

        /// <summary>
        /// Decrements the texture reference count, also removing an associated pool owner reference.
        /// When the reference count hits zero, the texture may be deleted and can't be used anymore.
        /// </summary>
        /// <param name="pool">The texture pool this texture is being removed from</param>
        /// <param name="id">The ID of the reference to this texture in the pool</param>
        /// <returns>True if the texture is now referenceless, false otherwise</returns>
        public bool DecrementReferenceCount(TexturePool pool, int id = -1)
        {
            lock (_poolOwners)
            {
                int references = _poolOwners.RemoveAll(entry => entry.Pool == pool && entry.ID == id || id == -1);

                if (references == 0)
                {
                    // This reference has already been removed.
                    return _referenceCount <= 0;
                }

                Debug.Assert(references == 1);
            }

            return DecrementReferenceCount();
        }

        /// <summary>
        /// Forcibly remove this texture from all pools that reference it.
        /// </summary>
        /// <param name="deferred">Indicates if the removal is being done from another thread.</param>
        public void RemoveFromPools(bool deferred)
        {
            lock (_poolOwners)
            {
                foreach (var owner in _poolOwners)
                {
                    owner.Pool.ForceRemove(this, owner.ID, deferred);
                }

                _poolOwners.Clear();
            }
        }

        /// <summary>
        /// Delete the texture if it is not used anymore.
        /// The texture is considered unused when the reference count is zero,
        /// and it has no child views.
        /// </summary>
        private void DeleteIfNotUsed()
        {
            // We can delete the texture as long as it is not being used
            // in any cache (the reference count is 0 in this case), and
            // also all views that may be created from this texture were
            // already deleted (views count is 0).
            if (_referenceCount == 0 && _views.Count == 0)
            {
                Dispose();
            }
        }

        /// <summary>
        /// Performs texture disposal, deleting the texture.
        /// </summary>
        private void DisposeTextures()
        {
            _currentData = null;
            HostTexture.Release();

            _arrayViewTexture?.Release();
            _arrayViewTexture = null;

            _flushHostTexture?.Release();
            _flushHostTexture = null;
        }

        /// <summary>
        /// Called when the memory for this texture has been unmapped.
        /// Calls are from non-gpu threads.
        /// </summary>
        /// <param name="unmapRange">The range of memory being unmapped</param>
        public void Unmapped(MultiRange unmapRange)
        {
            ChangedMapping = true;

            if (Group.Storage == this)
            {
                Group.ClearModified(unmapRange);
            }

            RemoveFromPools(true);
        }

        /// <summary>
        /// Performs texture disposal, deleting the texture.
        /// </summary>
        public void Dispose()
        {
            DisposeTextures();

            Disposed?.Invoke(this);

            if (Group.Storage == this)
            {
                Group.Dispose();
            }
        }
}
}