`content/kinase-algorithm.md`
What kind of biological signal causes learning to take place _after_ the plus phase of activity, as opposed to, for example, a reversed sequence of an outcome phase followed by a subsequent prediction phase? Is there some distinctive neural signature that marks these phases, so that the proper alignment occurs with sufficient reliability to drive effective learning?
There is always the possibility that a global [[neuromodulator]] signal could provide the critical "learn now" signal, and indeed [[norepinephrine]] has appropriate properties and projects widely into the neocortex. However, given the different timing of activity in different cortical areas, and the relatively localized nature of sensory prediction errors, for example, it would be more robust for a local, neuron-level signal to guide the timing of learning.
As discussed in [[temporal derivative#Timing of learning]], there are two peaks of relative activity associated with the onset of the minus and plus phases, so the minus phase peak can initiate the learning process and the plus phase peak can finalize it. The total excitatory and inhibitory conductance impinging on a neuron exhibits these peaks robustly and relatively smoothly, in terms of the difference between a fast vs. slow integration of this activity, so we use that difference to drive the timing of learning.
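
As a rough sketch of how such a timing signal could be computed, the following plain-Go example (not the Goal sim format used here, and not the actual Axon implementation) integrates a stand-in for the total conductance with fast and slow exponential time constants and takes the absolute difference as the timing signal; the variable names, time constants, and driver values are illustrative assumptions.

```Go
package main

import (
	"fmt"
	"math"
)

func main() {
	const (
		fastTau = 10.0 // assumed fast integration time constant (cycles)
		slowTau = 20.0 // assumed slow integration time constant (cycles)
	)
	var fast, slow float64
	for cyc := 0; cyc < 400; cyc++ {
		// ggTotal stands in for the total excitatory + inhibitory conductance:
		// a minus (prediction) phase level followed by a stronger plus (outcome)
		// phase level, using arbitrary example values.
		ggTotal := 0.5
		if cyc >= 200 {
			ggTotal = 0.8
		}
		fast += (ggTotal - fast) / fastTau
		slow += (ggTotal - slow) / slowTau
		diff := math.Abs(fast - slow) // timing signal: peaks just after each phase onset
		if cyc%20 == 0 {
			fmt.Printf("cycle %3d  fast=%.3f  slow=%.3f  diff=%.3f\n", cyc, fast, slow, diff)
		}
	}
}
```

The two bumps in `diff` at the phase onsets are what the peak-detection logic described below keys off of.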
The initial minus-phase peak is generally larger and more robust, so it dominates by triggering the start of the learning process. After a parameter-dependent number of cycles (milliseconds) from the onset of the minus-phase peak, the plus-phase peak is detected, and a specified number of cycles after that, the postsynaptic learning signal is recorded into the `LearnDiff` variable, to drive learning.
The peak-finding logic works robustly by recording the time and value whenever the current value exceeds the prior recorded peak. This ratcheting-up dynamic continues until the empirical peak value is reached. When the time since the last recorded peak exceeds the minimum minus cycle threshold, the peak baseline is reset and peak detection switches to finding the plus phase peak, using the same logic. When the cycles since that peak exceed the plus phase cycle parameter, learning occurs.
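
To make that sequencing concrete, here is a minimal Go sketch of a ratcheting peak detector along these lines; the state names and parameters (`minusCyclesMin`, `plusCycles`) are illustrative assumptions rather than the actual kinase-algorithm code, and a real implementation would presumably also apply a minimum peak threshold.

```Go
package main

import "fmt"

// phase of the peak-detection state machine (assumed names, for illustration).
type phase int

const (
	findMinusPeak phase = iota
	findPlusPeak
	done
)

// peakTimer sketches the ratcheting peak-detection logic described above.
type peakTimer struct {
	phase          phase
	peakVal        float64 // highest timing-signal value seen in the current phase
	peakCyc        int     // cycle at which that peak was recorded
	minusCyclesMin int     // min cycles past the minus-phase peak before switching to plus detection
	plusCycles     int     // cycles past the plus-phase peak at which learning is triggered
}

// step takes the current cycle and timing signal (e.g., |fast - slow|) and
// returns true on the single cycle when learning should occur.
func (pt *peakTimer) step(cyc int, diff float64) bool {
	if pt.phase == done {
		return false
	}
	if diff > pt.peakVal { // ratchet up: record any new peak
		pt.peakVal = diff
		pt.peakCyc = cyc
		return false
	}
	if pt.peakVal == 0 { // no peak recorded yet in this phase
		return false
	}
	since := cyc - pt.peakCyc
	if pt.phase == findMinusPeak && since > pt.minusCyclesMin {
		// minus-phase peak is over: reset the baseline, look for the plus-phase peak
		pt.phase = findPlusPeak
		pt.peakVal = 0
		pt.peakCyc = cyc
		return false
	}
	if pt.phase == findPlusPeak && since > pt.plusCycles {
		pt.phase = done
		return true // record the learning signal now (e.g., into LearnDiff)
	}
	return false
}

// bump is a toy triangular pulse centered at c, standing in for a phase-onset peak.
func bump(cyc, c int) float64 {
	d := cyc - c
	if d < 0 {
		d = -d
	}
	if d >= 20 {
		return 0
	}
	return float64(20-d) / 20
}

func main() {
	pt := &peakTimer{minusCyclesMin: 50, plusCycles: 20}
	for cyc := 0; cyc < 300; cyc++ {
		diff := bump(cyc, 10) + bump(cyc, 200) // minus-phase and plus-phase peaks
		if pt.step(cyc, diff) {
			fmt.Println("learning triggered at cycle", cyc)
		}
	}
}
```

In the actual algorithm, the timing signal driving this detector would be the smoothed conductance-based fast vs. slow difference described above, rather than this toy pulse train.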
## Stabilization and rescaling mechanisms
Using this equation provides significant benefits on tasks with temporal structure, typically with a $\tau_e$ factor of around 2-4 and $\lambda$ around 0.5.

```Goal
	s.Min.X.Ch(80) // clean rendering with variable width content
})
core.Bind(&dwtStr, dwtTx)

func updt() {
	td()
	dl.SetData(driver)
	fl.SetData(fast)
	sl.SetData(slow)
	dwtTx.UpdateRender()
	pw.NeedsRender()
}
```
In summary, [[#sim_td]] based on the competition between two simple exponential integration equations ([[#eq_fast-slow]]) demonstrates that a locally computed temporal derivative can drive synaptic changes in a manner consistent with an error signal that emerges over time.
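
In equation form, this kind of fast vs. slow exponential integration of a driver signal $x_t$ presumably takes a form like the following (a sketch consistent with the time constants used in [[#sim_diff]] below, not a transcription of [[#eq_fast-slow]]):

$$
\mathrm{fast}_t = \mathrm{fast}_{t-1} + \frac{1}{\tau_f} \left( x_t - \mathrm{fast}_{t-1} \right), \qquad
\mathrm{slow}_t = \mathrm{slow}_{t-1} + \frac{1}{\tau_s} \left( x_t - \mathrm{slow}_{t-1} \right)
$$

with $\tau_f < \tau_s$ (e.g., $\tau_f = 10$, $\tau_s = 20$ cycles in the simulation code below), and the temporal-derivative learning signal proportional to $\mathrm{fast}_t - \mathrm{slow}_t$.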
## Timing of learning
A critical issue with this temporal derivative framework is that the accurate computation of a prediction error signal must happen at some point _after_ the onset of the actual outcome. If learning happened during the minus (prediction) phase, for example, it would learn _toward_ the prediction state and _away_ from the prior outcome state!
Furthermore, the timing must allow the fast component sufficient time to deviate from the slow component, but not too much time, because then the difference will start to go away as the slow component catches up. In many Axon simulations, the inputs are presented at regular intervals for the sake of simplicity, and learning timing can be driven algorithmically. But how could it actually work in the brain?
There is a reliable timing signal available at each individual neuron, which could potentially drive the biological synaptic plasticity process, as illustrated in the following simulation. This signal is based on the way that the absolute value of the `fast` - `slow` difference evolves over time, plotted as `diff`:
{id="sim_diff" title="Timing for learning" collapsed="true"}
```Goal
fastTau := 10.0 // time constant for fast integration
slowTau := 20.0 // time constant for slow integration
pred := 50.0 // prediction (minus phase) driver value
out := 80.0 // outcome (plus phase) driver value
var dwtStr, fastStr, slowStr, predStr, outStr string

##
totalTime := 100
driver := zeros(totalTime) // driver is what is driving the system
fast := zeros(totalTime) // fast is a fast integrator of driver
slow := zeros(totalTime) // slow is a slow integrator of driver
diff := zeros(totalTime) // diff is the absolute value of the fast - slow difference
```
You can see that across different combinations of prediction and outcome driver states, the `diff` value exhibits two distinct peaks: one at the start, when the onset of prediction-phase activity drives `fast` and `slow` to change at their different rates, and another just after the onset of the outcome (plus) phase. Therefore, if we trigger learning to occur some number of cycles (milliseconds) after the onset of the second peak, it should generally happen around the end of the plus phase.
Because the duration of the minus and plus phases is not in principle reliable, both peaks need to be detected. The first, generally larger one can be thought of as a "priming" pulse that provides initial activation to the learning process, while the second one triggers the final adaptation process that is sensitive to the difference between the `fast` and `slow` components.
The one case where there isn't a second peak is when the outcome matches the prediction, in which case no learning will occur anyway. It is possible to add a timeout for learning after the first peak: if no second peak occurs within some amount of time, then everything resets and the process starts over.
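
As a minimal illustration of such a timeout, the reset condition could be as simple as the following Go sketch; the names (`timedOut`, `maxWait`, `secondPeakSeen`) are hypothetical, chosen only for this example.

```Go
package main

import "fmt"

// timedOut reports whether the learning process should reset: a first
// (priming) peak has occurred, but no second peak has appeared within
// maxWait cycles of it. All names are hypothetical, for illustration.
func timedOut(cyclesSinceFirstPeak, maxWait int, secondPeakSeen bool) bool {
	return !secondPeakSeen && cyclesSinceFirstPeak > maxWait
}

func main() {
	fmt.Println(timedOut(80, 100, false))  // false: still waiting for a second peak
	fmt.Println(timedOut(120, 100, false)) // true: give up and reset the process
	fmt.Println(timedOut(120, 100, true))  // false: second peak arrived, learning proceeds
}
```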
In a spiking network (e.g., the [[kinase algorithm]] for Axon), the time integrated values that drive learning are not nearly as smooth as those in [[#sim_diff]], because they have a significant contribution from postsynaptic spiking. However, much smoother values are available in the total excitatory and inhibitory conductances coming into each neuron, which sample from a large number of other neurons. The same peak-driven logic works well in this case, and is used in the [[kinase algorithm]].