Difference between revisions of "Creating Multithreaded Skyrim Mods"
{{WarningBox|Keep in mind that asynchronous operations means that you don't know how fast, or in what order, your threads will run or finish. | {{WarningBox|Keep in mind that asynchronous operations means that you don't know how fast, or in what order, your threads will run or finish. |
Revision as of 11:29, 19 January 2015
Under Construction.
This tutorial covers how modders can use Papyrus more effectively by leveraging its inherent multithreading capability. This guide includes plenty of examples and explanations to help you understand the design pattern. Using multithreading can greatly increase the performance of complex or repetitive Papyrus calculations, reduce overall strain on the Papyrus VM (making you a better mod-citizen), and help execute time-sensitive tasks.
Papyrus is a threaded scripting language. However, it can be a challenge to harness this attribute of the language.
The intended audience for this guide is intermediate to expert Papyrus developers. This design pattern requires SKSE for its use of Mod Events.
The examples provided are intended as a reference to adapt to your own needs. Each mod's needs are different, and because of the way Papyrus (and Skyrim) is designed, writing a generic framework that provides a solution for everyone is not possible. Change the examples to fit your unique requirements.
Please take your time going through this guide; there is a lot of information, but once you've grasped the idea, you'll be up and running in no time. There are a lot of codependencies, so some of what you'll be doing may not make sense until the end.
Should I Multithread?
The first question to answer is whether or not a multithreading solution is a good fit for your mod. There's no sense in refactoring hundreds of lines of code if you're not going to stand to benefit from it.
Does your mod / script:
- Have many external function calls to complete a single task?
- Have many objects that must be placed quickly using things like MoveTo()?
- Extensively or repeatedly use latent functions?
- Have time-critical tasks that rely on the results of other (potentially slow) functions?
- Have need of doing the same thing to a large group of objects?
If you answered "yes" to any of these bullets, a multithreaded design pattern may increase the performance of your mod. This pattern provides two distinct advantages:
- Multithreading takes tasks that would otherwise be run in sequence and allows them to run simultaneously, which can reduce the time it takes to complete all tasks.
- Due to the way the Papyrus scheduler must sync external calls to frames, many external calls can add a great deal of overhead (see this page on notes regarding how external calls suspend and resume threads); this pattern can greatly reduce the number of external function calls your script must use at any one time.
Only profiling your scripts with one of the various profiling functions can tell you whether or not these patterns will improve your mod's behavior. I have personally seen a speedup of more than 10 times in my own mods using this method (an action that once took ~8.5 seconds now takes ~0.5 seconds).
Note that by spinning up many threads simultaneously, you are invariably placing increased load on the Papyrus VM. You must decide whether or not the narrow "spike" of resource consumption using threads is better than the more spread-out "swell" of a single thread calling many functions back-to-back. Again, profile before and after!
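As a rough complement to the profiling functions, a simple stopwatch measurement can give a first before-and-after number. The following is a minimal sketch, not part of the original tutorial: the script name MyModTimingExample and the function DoExpensiveWork() are hypothetical placeholders, and the output only appears if Papyrus logging is enabled.

```papyrus
scriptname MyModTimingExample extends Quest
{Hypothetical script name, for illustration only.}

function TimeExpensiveWork()
    ;Note the real-world time (in seconds) before the work starts
    float startTime = Utility.GetCurrentRealTime()

    DoExpensiveWork()

    ;Subtract to get the elapsed wall-clock time in seconds
    float elapsed = Utility.GetCurrentRealTime() - startTime
    Debug.Trace("MyMod: DoExpensiveWork() took " + elapsed + " seconds")
endFunction

function DoExpensiveWork()
    ;Stand-in for the code being measured (for example, a series of PlaceAtMe() calls)
endFunction
```

Running the same measurement on the single-threaded and the multithreaded version of a task, several times each, gives a fairer comparison than a single run.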
Key Terms
- Thread - An individual script instance that does work. Returns results to a Future or raises a callback event (depending on the pattern used).
- Thread Manager - A script that controls which thread handles a task. Returns a Future to the user of the script, if using that pattern.
- Callback (Callback pattern) - A Mod Event that is raised when a thread completes. Passes the thread results in the event parameters.
- Future (Future pattern) - An object that will contain the result of a thread's operation at some point in the future.
- Future Anchor (Future pattern) - An object reference in an unused cell that we use to create Futures with, using PlaceAtMe().
A Private Army
In this example, we are developing a Conjuration mod. We need to spawn 20 guards very quickly when the player casts a spell; ideally, they should all appear at close to the same time. We also need to keep track of the guards we create, so we can destroy them after the spell ends. This guide will not cover creating a spell. Instead, we will skip to the point where we have already created our Spell and the MagicEffect that we want to add a script to.
We come up with the following script to drop onto our MagicEffect in the Creation Kit when our spell is cast:
scriptname SummonArmy extends ActiveMagicEffect
ActorBase property Guard auto
ObjectReference property GuardMarker auto
Actor Guard1
Actor Guard2
...
Actor Guard20
Event OnEffectStart(Actor akTarget, Actor akCaster)
if akCaster == Game.GetPlayer()
;Place actors according to the player's position, taking into account walls, obstacles, etc
MoveGuardMarkerNearPlayer(1) ;Moves the GuardMarker where the guard is supposed to go; maybe some GetPositions, etc
Guard1 = GuardMarker.PlaceAtMe(Guard) as Actor ;PlaceAtMe() returns an ObjectReference, so cast it to Actor
MoveGuardMarkerNearPlayer(2)
Guard2 = GuardMarker.PlaceAtMe(Guard) as Actor
;...and so on
MoveGuardMarkerNearPlayer(20)
Guard20 = GuardMarker.PlaceAtMe(Guard) as Actor
endif
endEvent
Event OnEffectFinish(Actor akTarget, Actor akCaster)
if akCaster == Game.GetPlayer()
Guard1.Disable()
Guard1.Delete()
;...and so on
Guard20.Disable()
Guard20.Delete()
endif
endEvent
We test this in-game, and we see each guard appear one-by-one. Kinda lame. We decide to upload it anyway, and our users complain that the spell is "slow" and "clunky".
PlaceAtMe() can be slow, especially over this many objects. We also have (for illustration purposes) some preprocessing that needs to happen (MoveGuardMarkerNearPlayer(int Index)) before we know where to put the guard. We have decided that multithreading this task would be much faster than placing each Actor one-by-one.
Creation Kit
- Create Quest: Begin by opening the Creation Kit and creating a new Quest. We'll call our quest GuardPlacementQuest. Click OK to save and close the quest, then open it again (to prevent the CK from crashing). Make sure that "Start Game Enabled", "Run Once", "Warn on Alias Failure" and "Allow repeated stages" are unchecked. Click OK to close it again.
- Create Future (Activator): Next, we want to create an object we will need later, called a Future. We'll get into what these do later. Open the Activator tree in the Creation Kit Object Window, and find 'xMarkerActivator'. Right-click and Duplicate this object. Double-click the duplicate and rename its Editor ID to identify it later; we'll call ours GuardPlacementFutureActivator.
- Create Anchor (Object Reference): We now want to create a "Future Anchor". This is an XMarker object reference that we will be placing in a far-off, unused cell. You can create your own blank cell, but AAADeleteWhenDoneTestJeremy is also a good candidate. Wherever you decide to place it, drag an XMarker Static from the Object Window of the Creation Kit into the Render Window and name the reference. We'll name ours GuardPlacementFutureAnchor. Later on, we'll use PlaceAtMe() on this object to create our Futures.
Threads
The thread is what will perform the work we want to perform in parallel. Just like the PlaceAtMe() needed to spawn our guards, we expect the result of our Thread to be an ObjectReference.
First, let's define a base Thread "class", called GuardPlacementThread.
scriptname GuardPlacementThread extends Quest hidden
;UNCOMMENT THIS after compiling Thread Manager
;import GuardPlacementThreadManager
;Thread control variables
ObjectReference future
int thread_id = -1
bool thread_queued = false
;Variables you need to get things done go here
ActorBase theGuard
Static theMarker
ObjectReference function get_async(Activator akFuture, ObjectReference akFutureAnchor, ActorBase akGuard, Static akXMarker)
;Let the Thread Manager know that this thread is busy
thread_queued = true
;If our thread doesn't have a unique ID, request one from Thread Manager
if thread_id == -1
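;UNCOMMENT THIS after compiling Thread Manager (it is attached to this same Quest, and GetThreadId() relies on instance state, so call it on the instance)
;thread_id = ((self as Quest) as GuardPlacementThreadManager).GetThreadId()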
endif
;Store our passed-in parameters to member variables
theGuard = akGuard
theMarker = akXMarker
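;Register for the event that will start our thread (SKSE needs both the event name and the callback name)
RegisterForModEvent("MyMod_OnGuardPlacement", "OnGuardPlacement")
;Create the Future that will contain our result before any work starts, so the event handler never sees an empty Future
future = akFutureAnchor.PlaceAtMe(akFuture)
;Raise the event that will start this thread
RaiseEvent_OnGuardPlacement(thread_id)
return future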
endFunction
function RaiseEvent_OnGuardPlacement(int iThreadId)
int handle = ModEvent.Create("MyMod_OnGuardPlacement")
if handle
ModEvent.PushInt(handle, iThreadId)
ModEvent.Send(handle)
else
;pass
endif
endFunction
bool function busy()
return thread_queued
endFunction
bool function has_future(ObjectReference akFuture)
if akFuture == future
return true
else
return false
endif
endFunction
bool function force_unlock()
clear_thread_vars()
thread_queued = false
return true
endFunction
Event OnGuardPlacement(int aiThreadId)
if thread_queued && aiThreadId == thread_id
;OK, let's get some work done!
ObjectReference tempMarker = Game.GetPlayer().PlaceAtMe(theMarker) ;We could have passed PlayerRef in as a get_async() parameter, too
MoveGuardMarkerNearPlayer(tempMarker)
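ObjectReference result = tempMarker.PlaceAtMe(theGuard)
;The temporary marker has done its job; remove it so markers don't pile up in the save
tempMarker.Disable()
tempMarker.Delete()
(future as GuardPlacementFuture).result = result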
clear_thread_vars()
thread_queued = false
endif
endEvent
function clear_thread_vars()
;Reset all thread variables to default state
theGuard = None
theMarker = None
endFunction
function MoveGuardMarkerNearPlayer(ObjectReference akMarker)
;Expensive SetPosition, GetPosition, FindNearestRef, etc calls here (illustration only)
endFunction
As you can see, our thread does a few important things:
- It has a get_async() function, which takes in all of the parameters necessary to do the work we need to perform.
- It grabs a unique thread_id if it doesn't already have one, which is used to act only on an Event raised by this thread.
- It defines and registers for an OnGuardPlacement Event, which is a custom Mod Event that does the work.
- get_async() returns a Future back to the Thread Manager (who will in turn give the Future back to our script).
- It does some work in the Event, but only if the thread has been 'queued' and the ThreadId of the Event matches ours.
- We return our results back to the Future we created.
- We clear all of our member variables using clear_thread_vars().
- We set thread_queued back to False, which tells the Thread Manager that this thread is available to be used again.
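The Mod Event mechanism that the thread relies on can be seen in isolation in a much smaller script. This is a minimal sketch, assuming SKSE is installed; the script name ModEventPingExample, the event name "MyMod_Ping" and the callback OnPing are made up for illustration and are not part of the tutorial's files.

```papyrus
scriptname ModEventPingExample extends Quest
{Hypothetical script, for illustration only. Requires SKSE.}

Event OnInit()
    ;Tell SKSE which callback on this script should receive the event
    RegisterForModEvent("MyMod_Ping", "OnPing")
endEvent

function SendPing(int aiValue)
    ;Build the event, attach one int argument, and queue it for delivery
    int handle = ModEvent.Create("MyMod_Ping")
    if handle
        ModEvent.PushInt(handle, aiValue)
        ModEvent.Send(handle)
    endif
    ;SendPing() returns here without waiting for OnPing() to run
endFunction

Event OnPing(int aiValue)
    ;The callback's parameters must match what was pushed onto the event
    Debug.Trace("MyMod: received ping " + aiValue)
endEvent
```

Because the sender does not wait for the receiver, the work done in the callback proceeds independently of the caller, which is the property the thread script builds on.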
Now that we've set up our base Thread script, we will create 10 child scripts that will extend this one. They will each contain only one line, the scriptname definition.
;GuardPlacementThread01.psc
scriptname GuardPlacementThread01 extends GuardPlacementThread
;GuardPlacementThread02.psc
scriptname GuardPlacementThread02 extends GuardPlacementThread
...
;GuardPlacementThread09.psc
scriptname GuardPlacementThread09 extends GuardPlacementThread
;GuardPlacementThread10.psc
scriptname GuardPlacementThread10 extends GuardPlacementThread
Once all of your Thread child scripts and your base Thread script are saved and compiled, attach the 10 child scripts to your Quest.
"But wait," you ask. "We need to place 20 guards, but we only have 10 threads. Won't something break?" The Thread Manager, which we'll talk about next, can handle having more work than there are threads!
Thread Manager
We will next define the Thread Manager script. This script handles delegating our work to an available thread. If a thread is not available, it waits until one is.
Since we may have many thread scripts, and it would be tedious to hook up properties we need to do our task in each and every one, define them here instead and we will pass them in as parameters to our threads. We will try to keep properties that have to be hooked up in the Creation Kit off of the threads themselves.
In the end, the function that we call in our Thread Manager will return a Future, which we can use to get our return value later.
scriptname GuardPlacementThreadManager extends Quest
Quest property GuardPlacementQuest auto
{The name of the thread management quest.}
Activator property GuardPlacementFutureActivator auto
{Our Future object.}
ObjectReference property GuardPlacementFutureAnchor auto
{Our Future Anchor object reference.}
Static property XMarker auto
{Tedious to define properties in the threads and hook up in CK over and over, so define things we need here. MoveGuardMarkerNearPlayer() needs XMarkers.}
int next_thread_id = 0
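;Hands out a unique ID to each thread. It uses next_thread_id, which is instance state, so it cannot be a global function.
int function GetThreadId()
next_thread_id += 1
return next_thread_id
endFunction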
;Let's keep our threads in script variables so things are less cluttered in our code.
;Papyrus only accepts literal values as variable initializers, so the casts happen in OnInit().
GuardPlacementThread01 thread01
GuardPlacementThread02 thread02
...
GuardPlacementThread09 thread09
GuardPlacementThread10 thread10
Event OnInit()
thread01 = GuardPlacementQuest as GuardPlacementThread01
thread02 = GuardPlacementQuest as GuardPlacementThread02
...
thread09 = GuardPlacementQuest as GuardPlacementThread09
thread10 = GuardPlacementQuest as GuardPlacementThread10
endEvent
;The 'public-facing' function that our MagicEffect script will interact with.
ObjectReference function PlaceConjuredGuardAsync(ActorBase akGuard)
int i = 0
ObjectReference future
while !future
if !thread01.busy()
future = thread01.get_async(GuardPlacementFutureActivator, GuardPlacementFutureAnchor, akGuard, XMarker)
elseif !thread02.busy()
future = thread02.get_async(GuardPlacementFutureActivator, GuardPlacementFutureAnchor, akGuard, XMarker)
...
elseif !thread09.busy()
future = thread09.get_async(GuardPlacementFutureActivator, GuardPlacementFutureAnchor, akGuard, XMarker)
elseif !thread10.busy()
future = thread10.get_async(GuardPlacementFutureActivator, GuardPlacementFutureAnchor, akGuard, XMarker)
else
;All threads are busy; wait and try again.
Utility.wait(0.1)
i += 1
if i >= 100
debug.trace("Error: A catastrophic error has occurred. All threads have become unresponsive. Please debug this issue or notify the author.")
i = 0
;Fail by returning None. The mod needs to be fixed.
return None
endif
endif
endWhile
return future
endFunction
;A helper function that can avert permanent thread failure if something goes wrong
function TryToUnlockThread(ObjectReference akFuture)
bool success = false
if thread01.has_future(akFuture)
success = thread01.force_unlock()
elseif thread02.has_future(akFuture)
success = thread02.force_unlock()
;...and so on
elseif thread09.has_future(akFuture)
success = thread09.force_unlock()
elseif thread10.has_future(akFuture)
success = thread10.force_unlock()
endif
if !success
debug.trace("Error: A thread has encountered an error and has become unresponsive.")
else
debug.trace("Warning: An unresponsive thread was successfully unlocked.")
endif
endFunction
The PlaceConjuredGuardAsync() function handles making sure that our work gets delegated to an available thread. The function then returns a Future once an available thread is found. Nearly as soon as a thread's get_async() function is called, it begins working, while our calling MagicEffect script is free to do other things in the meantime.
Compile and attach this script to your GuardPlacementQuest. Once you've done that, go back to your GuardPlacementThread.
A bit of house-cleaning: You can now uncomment 2 lines (noted) that were commented out because the Thread Manager didn't exist yet. Uncomment those two lines and re-compile the GuardPlacementThread script. (It's not necessary to recompile all of the children.)
Back to the Future
A Future, in parallel processing, is the representation of an asynchronous operation. It can be thought of as a placeholder in lieu of your result until your result has arrived. Like the Google App Engine version that this was inspired by, when the Future is created, it will probably not have any results yet. Your script can store a Future and later call the Future object's get_result() function. If the result has arrived, get_result() returns it; otherwise, it waits for the result to arrive, and then returns it.
Let's create our Future:
scriptname GuardPlacementFuture extends ObjectReference
Quest property GuardPlacementQuest auto
ObjectReference r
ObjectReference property result hidden
function set(ObjectReference akResult)
done = true
r = akResult
endFunction
endProperty
bool done = false
bool function done()
return done
endFunction
ObjectReference function get_result()
;Terminate the request after 10 seconds, or as soon as we have a result
int i = 0
while !done && i < 100
i += 1
utility.wait(0.1)
endWhile
RegisterForSingleUpdate(0.1)
if i >= 100
;Our thread probably encountered an error and is locked up; we need to unlock it.
(GuardPlacementQuest as GuardPlacementThreadManager).TryToUnlockThread(self as ObjectReference)
endif
return r
endFunction
Event OnUpdate()
self.Disable()
self.Delete()
endEvent
This script should be compiled and attached to the Future Activator object we created earlier. Note the Type of the result; this could be changed to any data type you need to return.
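If blocking inside get_result() is not desirable, the Future's done() function can be polled instead. The following is a minimal sketch under stated assumptions: GuardFuturePoller, Watch(), and the 0.25 second interval are hypothetical and not part of the tutorial's files; only done() and get_result() come from the Future script above.

```papyrus
scriptname GuardFuturePoller extends Quest
{Hypothetical helper, not one of the tutorial's scripts.}

ObjectReference pendingFuture
Actor placedGuard
int attempts = 0

;Start watching a Future returned by the Thread Manager
function Watch(ObjectReference akFuture)
    pendingFuture = akFuture
    attempts = 0
    RegisterForSingleUpdate(0.25)
endFunction

Event OnUpdate()
    GuardPlacementFuture f = pendingFuture as GuardPlacementFuture
    if !f
        return
    endif
    if f.done() || attempts >= 40
        ;Either the result is ready, or we stop polling after roughly 10 seconds.
        ;In the second case get_result() still waits for its own timeout and then tries to unlock the thread.
        placedGuard = f.get_result() as Actor
        pendingFuture = None
    else
        ;Not ready yet; check again shortly without blocking this script
        attempts += 1
        RegisterForSingleUpdate(0.25)
    endif
endEvent
```

Calling get_result() in both branches matters, because that call is what schedules the Future for deletion.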
Tying it All Together
Now that we've created our Threads, our Thread Manager, and our Future script, we can start to put them to work. Since we aren't calling the functions we want to execute directly, we need to change how we do things slightly.
The previous execution flow was:
- Call each function, one by one, and store the results. (PlaceAtMe(), etc)
The flow using threads now is:
- Call an Async function on our Thread Manager, and store the Future it returns.
- Later, call the get_result() function of the Future to retrieve the results.
In our original ActiveMagicEffect script, we did all of our MoveGuardMarkerNearPlayer() and PlaceAtMe() calls in a row, getting a series of Actor references for our guards in return. We're going to modify that slightly to use our shiny new threaded placement system:
scriptname SummonArmy extends ActiveMagicEffect
Quest property GuardPlacementQuest auto
{We need a reference to our quest with the threads and Thread Manager defined.}
ActorBase property Guard auto
ObjectReference property GuardMarker auto
Actor Guard1
Actor Guard2
...
Actor Guard20
Event OnEffectStart(Actor akTarget, Actor akCaster)
if akCaster == Game.GetPlayer()
;Place actors according to the player's position, taking into account walls, obstacles, etc
;Cast the Quest as our Thread Manager and store it
GuardPlacementThreadManager threadmgr = GuardPlacementQuest as GuardPlacementThreadManager
;Call PlaceConjuredGuardAsync for each Guard and store the returned Future
ObjectReference Guard1Future = threadmgr.PlaceConjuredGuardAsync(Guard)
ObjectReference Guard2Future = threadmgr.PlaceConjuredGuardAsync(Guard)
ObjectReference Guard3Future = threadmgr.PlaceConjuredGuardAsync(Guard)
;...and so on
ObjectReference Guard19Future = threadmgr.PlaceConjuredGuardAsync(Guard)
ObjectReference Guard20Future = threadmgr.PlaceConjuredGuardAsync(Guard)
;Collect the results
Guard1 = (Guard1Future as GuardPlacementFuture).get_result() as Actor ;get_result() returns an ObjectReference, so cast it to Actor
Guard2 = (Guard2Future as GuardPlacementFuture).get_result() as Actor
Guard3 = (Guard3Future as GuardPlacementFuture).get_result() as Actor
;...and so on
Guard19 = (Guard19Future as GuardPlacementFuture).get_result() as Actor
Guard20 = (Guard20Future as GuardPlacementFuture).get_result() as Actor
endif
endEvent
Event OnEffectFinish(Actor akTarget, Actor akCaster)
if akCaster == Game.GetPlayer()
Guard1.Disable()
Guard1.Delete()
;...and so on
Guard20.Disable()
Guard20.Delete()
endif
endEvent
Here, instead of doing the work in our script, we delegated the work to the Thread Manager, and stored the Futures that it returned to us. Then, we gathered the results using our Futures' get_result() function. We don't have to worry about our threads or the state of the Futures; those are freed up and cleared for us by the system.
Even though all of the threads are working in parallel and might not finish at the same time, the get_result() function will wait until a result is available before returning. We can be sure that we will get the results even if they are processed out of order. For instance, if thread 2 completed before thread 1, calling the thread 1 Future's get_result() function will pause the script until a result is available. Then the thread 2 Future's result is gathered, and so on.
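The same queue-then-collect flow can be written with arrays instead of twenty separate variables. This is a sketch of one possible variant, not the tutorial's own script: SummonArmyArrays is a hypothetical name, and the None checks are an added precaution because PlaceConjuredGuardAsync() returns None when every thread stays busy.

```papyrus
scriptname SummonArmyArrays extends ActiveMagicEffect
{Hypothetical variant of SummonArmy that uses arrays.}

Quest property GuardPlacementQuest auto
ActorBase property Guard auto

ObjectReference[] futures
Actor[] guards

Event OnEffectStart(Actor akTarget, Actor akCaster)
    if akCaster != Game.GetPlayer()
        return
    endif
    GuardPlacementThreadManager threadmgr = GuardPlacementQuest as GuardPlacementThreadManager
    futures = new ObjectReference[20]
    guards = new Actor[20]

    ;First pass: queue all of the work and keep the Futures
    int i = 0
    while i < futures.Length
        futures[i] = threadmgr.PlaceConjuredGuardAsync(Guard)
        i += 1
    endWhile

    ;Second pass: collect the results, in order
    i = 0
    while i < futures.Length
        GuardPlacementFuture f = futures[i] as GuardPlacementFuture
        if f
            guards[i] = f.get_result() as Actor
        endif
        i += 1
    endWhile
endEvent

Event OnEffectFinish(Actor akTarget, Actor akCaster)
    ;Clean up every guard that was actually placed
    int i = 0
    while i < guards.Length
        if guards[i]
            guards[i].Disable()
            guards[i].Delete()
        endif
        i += 1
    endWhile
endEvent
```

Keeping the two passes separate is the important part: collecting a result inside the first loop would make each call wait for its guard and bring back the one-by-one behavior.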
Final Notes
- Be a good Papyrus and Skyrim citizen and read the results from your Futures as soon as you are able so that they can be disposed of. If Futures begin to pile up without being read and destroyed, save game bloat could occur.
- If you are running operations in an always-on background script that you want to multithread, and you will always have the same number of results back, it may make more sense for you to implement a static set of Future references that are never destroyed that you continue to reuse. This would prevent the churn of Futures being created and destroyed and may lend itself to faster performance. Keep in mind that this would probably result in some data loss if your Futures are not read from regularly as the new results overwrite the old ones.
- If something doesn't seem to be working correctly, turn on debugging and insert debug.trace() messages where you think your code may be failing. Running the game in windowed mode and using a real-time log viewer like SnakeTail is extremely helpful for diagnosing script problems.
- You can create as many threads as you want, but I wouldn't recommend more than 50 or so. It depends on your needs, the strain each thread places on the Papyrus VM, and how quickly you need your results.
- If you need to perform a set of actions that are not all the same, the Thread Manager pattern might not be best for you. You may want to create different thread base scripts purpose-built for several tasks and then call their get_async() functions directly, blocking on busy() until they're available. You can still run many different tasks concurrently this way, even if they're not the same.
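The last note above, calling a thread's get_async() directly and blocking on busy(), could look roughly like this. It is a sketch under assumptions: DirectThreadCallExample and PlaceOneGuardDirectly() are hypothetical, the properties would need to be filled in the Creation Kit, and the 10 second limit is an arbitrary choice.

```papyrus
scriptname DirectThreadCallExample extends Quest
{Hypothetical caller script, for illustration only.}

Quest property GuardPlacementQuest auto
Activator property GuardPlacementFutureActivator auto
ObjectReference property GuardPlacementFutureAnchor auto
ActorBase property Guard auto
Static property XMarker auto

ObjectReference function PlaceOneGuardDirectly()
    ;Pick one specific thread instead of asking the Thread Manager
    GuardPlacementThread01 worker = GuardPlacementQuest as GuardPlacementThread01

    ;Block until that thread is free, giving up after roughly 10 seconds
    int i = 0
    while worker.busy() && i < 100
        Utility.Wait(0.1)
        i += 1
    endWhile
    if worker.busy()
        return None
    endif

    ;Hand the work to the thread and return its Future to the caller
    return worker.get_async(GuardPlacementFutureActivator, GuardPlacementFutureAnchor, Guard, XMarker)
endFunction
```

If several scripts can call the same thread this way, two of them may both see busy() return false before either calls get_async(), so this approach is safest when each thread has a single caller.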
Ideas
- A system could be created to dynamically scale the number of available threads up and down, depending on current system performance.
- A selection via a MessageBox or a SkyUI MCM slider could be created to allow the user to select the maximum number of available threads.
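One way to approach the second idea is to keep the threads in an array and only consider the first MaxThreads entries. The following is a sketch, not a tested implementation: GuardPlacementThreadPoolExample and the MaxThreads property are hypothetical, and how MaxThreads gets set (MessageBox, MCM, or otherwise) is left out.

```papyrus
scriptname GuardPlacementThreadPoolExample extends Quest
{Hypothetical alternative to the if/elseif chain; attach it to the same Quest as the threads.}

Activator property GuardPlacementFutureActivator auto
ObjectReference property GuardPlacementFutureAnchor auto
Static property XMarker auto
int property MaxThreads = 10 auto
{How many threads may be used at once. Assumed to stay between 1 and 10.}

GuardPlacementThread[] threads

Event OnInit()
    threads = new GuardPlacementThread[10]
    Quest me = self as Quest
    threads[0] = me as GuardPlacementThread01
    threads[1] = me as GuardPlacementThread02
    ;...and so on for the threads in between; unfilled slots stay None and are skipped below
    threads[8] = me as GuardPlacementThread09
    threads[9] = me as GuardPlacementThread10
endEvent

ObjectReference function PlaceConjuredGuardAsync(ActorBase akGuard)
    int tries = 0
    int i = 0
    while tries < 100
        ;Look for a free thread among the first MaxThreads slots
        i = 0
        while i < MaxThreads && i < threads.Length
            if threads[i] && !threads[i].busy()
                return threads[i].get_async(GuardPlacementFutureActivator, GuardPlacementFutureAnchor, akGuard, XMarker)
            endif
            i += 1
        endWhile
        ;All allowed threads are busy; wait and try again
        Utility.Wait(0.1)
        tries += 1
    endWhile
    Debug.Trace("Error: no thread became available within about 10 seconds.")
    return None
endFunction
```

Lowering MaxThreads at runtime simply stops new work from being handed to the higher-numbered threads; work already queued on them still finishes.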
Conclusion
I hope this tutorial helps shed light on how you might implement your own multithreaded design pattern in your mods to tackle repetitive, resource-hungry tasks. If you have a question about the examples presented here, or if you see an error, please add your comments to the Discussion tab. Good luck, and happy modding!
- Chesko