This was the result of a typo accidentally introduced in
e51d715700. This restores the previous
correct behavior.
The behavior with the reference was incorrect and would cause some games
to fail to boot.
Conceptually, it doesn't make sense for a thread to be able to persist
the lifetime of a scheduler. A scheduler should be taking care of the
threads; the threads should not be taking care of the scheduler.
If the threads outlive the scheduler (or we simply don't actually
terminate/shutdown the threads), then it should be considered a bug
that we need to fix.
Attributing this to balika011, as they opened #1317 to attempt to fix
this in a similar way, but my refactoring of the kernel code caused
quite a few conflicts.
Many of the member variables of the thread class aren't even used
outside of the class itself, so there's no need to make those variables
public. This change follows in the steps of the previous changes that
made other kernel types' members private.
The main motivation behind this is that the Thread class will likely
change in the future as emulation becomes more accurate, and letting
random bits of the emulator access data members of the Thread class
directly makes it a pain to shuffle around and/or modify internals.
Having all data members public like this also makes it difficult to
reason about certain bits of behavior without first verifying what parts
of the core actually use them.
Everything being public also generally follows the tendency for changes
to be introduced in completely different translation units that would
otherwise be better introduced as an addition to the Thread class'
public interface.
Now that we have all of the rearranging and proper structure sizes in
place, it's fairly trivial to implement svcGetThreadContext(). In the
64-bit case we can more or less just write out the context as is, minus
some minor value sanitizing. In the 32-bit case we'll need to clear out
the registers that wouldn't normally be accessible from a 32-bit
AArch32 exectuable (or process).
This will be necessary for the implementation of svcGetThreadContext(),
as the kernel checks whether or not the process that owns the thread
that has it context being retrieved is a 64-bit or 32-bit process.
If the process is 32-bit, then the upper 15 general-purpose registers
and upper 16 vector registers are cleared to zero (as AArch32 only has
15 GPRs and 16 128-bit vector registers. not 31 general-purpose
registers and 32 128-bit vector registers like AArch64).
Makes the public interface consistent in terms of how accesses are done
on a process object. It also makes it slightly nicer to reason about the
logic of the process class, as we don't want to expose everything to
external code.
boost::static_pointer_cast for boost::intrusive_ptr (what SharedPtr is),
takes its parameter by const reference. Given that, it means that this
std::move doesn't actually do anything other than obscure what the
function's actual behavior is, so we can remove this. To clarify, this
would only do something if the parameter was either taking its argument
by value, by non-const ref, or by rvalue-reference.
The locations of these can actually vary depending on the address space
layout, so we shouldn't be using these when determining where to map
memory or be using them as offsets for calculations. This keeps all the
memory ranges flexible and malleable based off of the virtual memory
manager instance state.
Previously, these were reporting hardcoded values, but given the regions
can change depending on the requested address spaces, these need to
report the values that the memory manager contains.
Rather than hard-code the address range to be 36-bit, we can derive the
parameters from supplied NPDM metadata if the supplied exectuable
supports it. This is the bare minimum necessary for this to be possible.
The following commits will rework the memory code further to adjust to
this.
The owning process of a thread is required to exist before the thread,
so we can enforce this API-wise by using a reference. We can also avoid
the reliance on the system instance by using that parameter to access
the page table that needs to be set.
This can just be a regular function, getting rid of the need to also
explicitly undef the define at the end of the file. Given FuncReturn()
was already converted into a function, it's #undef can also be removed.
This modifies the CPU interface to more accurately match an
AArch64-supporting CPU as opposed to an ARM11 one. Two of the methods
don't even make sense to keep around for this interface, as Adv Simd is
used, rather than the VFP in the primary execution state. This is
essentially a modernization change that should have occurred from the
get-go.
The kernel does the equivalent of the following check before proceeding:
if (address + 0x8000000000 < 0x7FFFE00000) {
return ERR_INVALID_MEMORY_STATE;
}
which is essentially what our IsKernelVirtualAddress() function does. So
we should also be checking for this.
The kernel also checks if the given input addresses are 4-byte aligned,
however our Mutex::TryAcquire() and Mutex::Release() functions already
handle this, so we don't need to add code for this case.
The kernel caps the size limit of shared memory to 8589930496 bytes (or
(1GB - 512 bytes) * 8), so approximately 8GB, where every GB has a 512
byte sector taken off of it.
It also ensures the shared memory is created with either read or
read/write permissions for both permission types passed in, allowing the
remote permissions to also be set as "don't care".
Part of the checking done by the kernel is to check if the given
address and size are 4KB aligned, as well as checking if the size isn't
zero. It also only allows mapping shared memory as readable or
read/write, but nothing else, and so we shouldn't allow mapping as
anything else either.
Previously, these were sitting outside of the Kernel namespace, which
doesn't really make sense, given they're related to the Thread class
which is within the Kernel namespace.
While unlikely, it does avoid constructing a std::string and
unnecessarily calling into the memory code if a game or executable
decides to be really silly about their logging.
Given we now have the kernel as a class, it doesn't make sense to keep
the current process pointer within the System class, as processes are
related to the kernel.
This also gets rid of a subtle case where memory wouldn't be freed on
core shutdown, as the current_process pointer would never be reset,
causing the pointed to contents to continue to live.
Now that we have a class representing the kernel in some capacity, we
now have a place to put the named port map, so we move it over and get
rid of another piece of global state within the core.
The follow-up to e2457418da, which
replaces most of the includes in the core header with forward declarations.
This makes it so that if any of the headers the core header was
previously including change, then no one will need to rebuild the bulk
of the core, due to core.h being quite a prevalent inclusion.
This should make turnaround for changes much faster for developers.
As means to pave the way for getting rid of global state within core,
This eliminates kernel global state by removing all globals. Instead
this introduces a KernelCore class which acts as a kernel instance. This
instance lives in the System class, which keeps its lifetime contained
to the lifetime of the System class.
This also forces the kernel types to actually interact with the main
kernel instance itself instead of having transient kernel state placed
all over several translation units, keeping everything together. It also
has a nice consequence of making dependencies much more explicit.
This also makes our initialization a tad bit more correct. Previously we
were creating a kernel process before the actual kernel was initialized,
which doesn't really make much sense.
The KernelCore class itself follows the PImpl idiom, which allows
keeping all the implementation details sealed away from everything else,
which forces the use of the exposed API and allows us to avoid any
unnecessary inclusions within the main kernel header.
We can make this error code an alias of the resource limit exceeded
error code, allowing us to get rid of the lingering 3DS error code of
the same type.
We already have the variable itself set up to perform this task, so we
can just return its value from the currently executing process instead
of always stubbing it to zero.
Allows querying the inverse of IsDomain() to make things more readable.
This will likely also be usable in the event of implementing
ConvertDomainToSession().
Despite being covered by a global mutex, we should still ensure that the
class handles its reference counts properly. This avoids potential
shenanigans when it comes to data races.
Given this is the root object that drives quite a bit of the kernel
object hierarchy, ensuring we always have the correct behavior (and no
races) is a good thing.
The current core may have nothing to do with the core where the new thread was scheduled to run. In case it's the same core, then the following PrepareReshedule call will take care of that.
WakeAfterDelay might be called from any host thread, so err on the side of caution and use the thread-safe CoreTiming::ScheduleEventThreadsafe.
Note that CoreTiming is still far from thread-safe, there may be more things we have to work on for it to be up to par with what we want.
Exit from AddMutexWaiter early if the thread is already waiting for a mutex owned by the owner thread.
This accounts for the possibility of a thread that is waiting on a condition variable being awakened twice in a row.
Also added more validation asserts.
This should fix one of the random crashes in Breath Of The Wild.
These members don't need to be entirely exposed, we can instead expose
an API to operate on them without directly needing to mutate them
We can also guard against overflow/API misuse this way as well, given
active_sessions is an unsigned value.
This amends cases where crashes can occur that were missed due to the
odd way the previous code was set up (using 3DS memory regions that
don't exist).
Using member variables for referencing the segments array increases the
size of the class in memory for little benefit. The same behavior can be
achieved through the use of accessors that just return the relevant
segment.
Avoids using a u32 to compare against a range of size_t, which can be a
source of warnings. While we're at it, compress a std::tie into a
structured binding.
General moving to keep kernel object types separate from the direct
kernel code. Also essentially a preliminary cleanup before eliminating
global kernel state in the kernel code.
This introduces a slightly more generic variant of WriteBuffer().
Notably, this variant doesn't constrain the arguments to only accepting
std::vector instances. It accepts whatever adheres to the
ContiguousContainer concept in the C++ standard library.
This essentially means, std::array, std::string, and std::vector can be
used directly with this interface. The interface no longer forces you to
solely use containers that dynamically allocate.
To ensure our overloads play nice with one another, we only enable the
container-based WriteBuffer if the argument is not a pointer, otherwise
we fall back to the pointer-based one.
The reason this would never be true is that ideal_processor is a u8 and
THREADPROCESSORID_DEFAULT is an s32. In this case, it boils down to how
arithmetic conversions are performed before performing the comparison.
If an unsigned value has a lesser conversion rank (aka smaller size)
than the signed type being compared, then the unsigned value is promoted
to the signed value (i.e. u8 -> s32 happens before the comparison). No
sign-extension occurs here either.
An alternative phrasing:
Say we have a variable named core and it's given a value of -2.
u8 core = -2;
This becomes 254 due to the lack of sign. During integral promotion to
the signed type, this still remains as 254, and therefore the condition
will always be true, because no matter what value the u8 is given it
will never be -2 in terms of 32 bits.
Now, if one type was a s32 and one was a u32, this would be entirely
different, since they have the same bit width (and the signed type would
be converted to unsigned instead of the other way around) but would
still have its representation preserved in terms of bits, allowing the
comparison to be false in some cases, as opposed to being true all the
time.
---
We also get rid of two signed/unsigned comparison warnings while we're
at it.
Previously, the buffer_index parameter was unused, causing all writes to
use the buffer index of zero, which is not necessarily what is wanted
all the time.
Thankfully, all current usages don't use a buffer index other than zero,
so this just prevents a bug before it has a chance to spring.
This would result in a lot of allocations and related object
construction, just to toss it all away immediately after the call.
These are definitely not intentional, and it was intended that all of
these should have been accessing the static function GetInstance()
through the name itself, not constructed instances.
This situation may happen like so:
Thread 1 with low priority calls WaitProcessWideKey with timeout.
Thread 2 with high priority calls WaitProcessWideKey without timeout.
Thread 3 calls SignalProcessWideKey
- Thread 2 acquires the lock and awakens.
- Thread 1 can't acquire the lock and is put to sleep with the lock owner being Thread 2.
Thread 1's timeout expires, with the lock owner still being set to Thread 2.
This makes the formatting expectations more obvious (e.g. any zero padding specified
is padding that's entirely dedicated to the value being printed, not any pretty-printing
that also gets tacked on).
* GetSharedFontInOrderOfPriority
* Update pl_u.cpp
* Ability to use ReadBuffer and WriteBuffer with different buffer indexes, fixed up GetSharedFontInOrderOfPriority
* switched to NGLOG
* Update pl_u.cpp
* Update pl_u.cpp
* language_code is actually language code and not index
* u32->u64
* final cleanups
Verified with a hwtest and implemented based on reverse engineering.
Thread A's priority will get bumped to the highest priority among all the threads that are waiting for a mutex that A holds.
Once A releases the mutex and ownership is transferred to B, A's priority will return to normal and B's priority will be bumped.
Switch mutexes are no longer kernel objects, they are managed in userland and only use the kernel to handle the contention case.
Mutex addresses store a special flag value (0x40000000) to notify the guest code that there are still some threads waiting for the mutex to be released. This flag is updated when a thread calls ArbitrateUnlock.
TODO:
* Fix svcWaitProcessWideKey
* Fix svcSignalProcessWideKey
* Remove the Mutex class.
* Updated ACC with more service names
* Updated SVC with more service names
* Updated set with more service names
* Updated sockets with more service names
* Updated SPL with more service names
* Updated time with more service names
* Updated vi with more service names
Ported from citra PR #3091
The delay specified here is from a Nintendo 3DS, and should be measured in a Nintendo Switch.
This change is enough to prevent Puyo Puyo Tetris's main thread starvation.