
Commit a6300f2

double-peak mechanism for cortical learning timing, allowing fully continuous learning. Working nearly as well as externally driven regular timing.
1 parent 8e74ed6 commit a6300f2


2 files changed: +129 -15 lines changed


content/kinase-algorithm.md

Lines changed: 4 additions & 5 deletions
@@ -206,13 +206,13 @@ At the point of learning, which occurs once at the end of a [[theta rhythm|theta

What kind of biological signal causes learning to take place _after_ the plus phase of activity, as opposed to a reversed sequence of an outcome phase followed by a subsequent prediction phase for example? Is there some distinctive neural signature that marks these phases so that the proper alignment occurs with sufficient reliability to drive effective learning?

-While there is always the possibility that a global [[neuromodulator]] signal (e.g., dopamine) could provide the critical "learn now" signal, it is also the case that a local synaptic mechanism can provide a sufficiently accurate signal to work in practice in large-scale simulated models.
+There is always the possibility that a global [[neuromodulator]] signal could provide the critical "learn now" signal, and indeed [[norepinephrine]] has appropriate properties and projects widely into the neocortex. However, given the different timing of activity in different cortical areas, and the relatively localized nature of sensory prediction errors, for example, it would be more robust for a local, neuron-level signal to guide the timing of learning.

-Specifically, the conjunction of significant pre and postsynaptic activity is actually sufficiently rare that there are typically relatively brief windows of synapse-specific activity followed by relative inactivity, and that this _transition to inactivity_ can mark the end of a prior plus phase.
+As discussed in [[temporal derivative#Timing of learning]], there are two peaks of relative activity, associated with the onset of the minus and plus phases, so that the minus phase peak can initiate the learning process and the plus phase peak can finalize it. The total excitatory and inhibitory conductance impinging on a neuron exhibits these peaks robustly and relatively smoothly, in terms of the difference between a fast vs. slow integration of this activity, so we use this difference to drive the timing of learning.
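To make this concrete, here is a minimal, self-contained Go sketch of the idea (not the actual Axon implementation; the conductance values, time constants, and phase timing are invented for illustration): the absolute difference between a fast and a slow exponential integrator of the neuron's total conductance peaks shortly after each phase onset and then decays as the slow integrator catches up.

```Go
// Sketch only: fast vs. slow integration of a neuron's total synaptic
// conductance, producing a timing signal that peaks at phase onsets.
package main

import (
	"fmt"
	"math"
)

func main() {
	const fastTau, slowTau = 10.0, 20.0 // integration time constants in cycles (illustrative)
	var fast, slow float64
	for cycle := 0; cycle < 100; cycle++ {
		gTotal := 0.5 // total excitatory + inhibitory conductance during the minus (prediction) phase
		if cycle >= 75 {
			gTotal = 0.8 // conductance steps up at the onset of the plus (outcome) phase
		}
		fast += (gTotal - fast) / fastTau // fast integrator tracks the input quickly
		slow += (gTotal - slow) / slowTau // slow integrator lags behind
		diff := math.Abs(fast - slow)     // timing signal: peaks shortly after each phase onset
		fmt.Printf("cycle %3d  diff %.4f\n", cycle, diff)
	}
}
```

Because the total conductances pool over many presynaptic inputs, this difference signal is much smoother than anything based on individual spikes, which is what makes its peaks usable as a timing signal.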

-In biophysical terms, the CaMKII and DAPK1 competitive binding dynamic takes place when there is a relatively high level of Ca++ and activated calmodulin (CaM) in a relatively brief window after synaptic activity. Once this activity falls off, DAPK1 returns to its baseline state while CaMKII that has been bound to N2B remains active for a sufficient duration to trigger the AMPA receptor trafficking dynamics that result in actual changes in synaptic efficacy ([[@BayerGiese25]]). This process takes time, and requires relative DAPK1 inactivation to proceed, so it preferentially occurs during the transition to inactivity after a learning episode. Whatever final state the CaMKII vs. DAPK1 competition was in at the point of this transition determines the resulting LTP vs. LTD direction.
+The initial minus-phase peak is generally larger and more robust, so it dominates by triggering the start of the learning process. After a parameter-dependent number of cycles (milliseconds) from the onset of the minus-phase peak, the plus-phase peak is detected, and a specified number of cycles after that, the postsynaptic learning signal is recorded into the `LearnDiff` variable, to drive learning.

-This rule was implemented and tested extensively, and it worked well across a wide range of tasks. The "omniscient" version where we know when the plus phase ends still learns faster, and we continue to use that in our models to save computational time, but especially in the much larger scale of the actual mammalian brain, this issue does not appear to pose a significant problem for the overall biological feasibility of the kinase algorithm.
+The peak-finding logic works robustly by recording the time and peak value whenever the current value exceeds the prior peak. This ratcheting-up dynamic continues until the actual peak is reached. When the time since the last peak exceeds the minimum minus cycle threshold, the peak baseline is reset and peak detection switches to finding the plus phase peak, using the same logic. When the cycles since that peak exceed the plus phase cycle parameter, learning occurs.
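The following is a hypothetical Go sketch of this peak-finding logic and of the `LearnDiff` recording described above (the type, field names, and parameter names are illustrative, not the actual Axon code):

```Go
// Sketch of the two-phase, ratcheting peak detector described in the text.
package kinase

// peakDetector tracks the minus-phase peak, then the plus-phase peak, and
// records the learning signal a fixed number of cycles after the latter.
type peakDetector struct {
	minMinusCycles int     // minimum cycles past the minus-phase peak before switching phases
	plusCycles     int     // cycles after the plus-phase peak at which learning occurs
	inPlus         bool    // false: tracking the minus-phase peak; true: the plus-phase peak
	peakVal        float64 // current ratcheted peak value
	peakCycle      int     // cycle at which peakVal last increased
	LearnDiff      float64 // recorded postsynaptic learning signal
}

// step is called once per cycle with the current timing signal (e.g., the
// fast vs. slow conductance difference) and the value to record for learning.
// It returns true on the cycle at which learning should occur.
func (pd *peakDetector) step(cycle int, signal, learnVal float64) bool {
	if signal > pd.peakVal { // ratchet up: record the new peak value and its time
		pd.peakVal = signal
		pd.peakCycle = cycle
	}
	since := cycle - pd.peakCycle
	if !pd.inPlus {
		if since > pd.minMinusCycles { // minus-phase peak has passed: look for the plus-phase peak
			pd.inPlus = true
			pd.peakVal = 0 // reset the baseline for plus-phase peak detection
			pd.peakCycle = cycle
		}
		return false
	}
	if since > pd.plusCycles { // plus-phase peak has passed: learn now
		pd.LearnDiff = learnVal // record the learning signal
		pd.inPlus = false       // reset to detect the next minus-phase peak
		pd.peakVal = 0
		pd.peakCycle = cycle
		return true
	}
	return false
}
```

A driver loop would call `step` every cycle and apply the synaptic update whenever it returns true.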

## Stabilization and rescaling mechanisms

@@ -386,4 +386,3 @@ $$

Using this equation provides significant benefits on tasks with temporal structure, typically with a $\tau_e$ factor of around 2-4 and $\lambda$ around 0.5.
content/temporal-derivative.md

Lines changed: 125 additions & 10 deletions
@@ -44,7 +44,7 @@ fastTau := 10.0 // time constant for fast integration
slowTau := 20.0 // time constant for slow integration
pred := 50.0
out := 80.0
-var diffStr, fastStr, slowStr, predStr, outStr string
+var dwtStr, fastStr, slowStr, predStr, outStr string

##
totalTime := 100
@@ -78,9 +78,9 @@ func td() {
##
}
##
-diff := fast[-1] - slow[-1]
+dwt := fast[-1] - slow[-1]
##
-diffStr = fmt.Sprintf("<b>Weight Change ΔW ≅ Prediction - Outcome = Fast - Slow = %7.2g</b>", diff.Float1D(0))
+dwtStr = fmt.Sprintf("<b>Weight Change ΔW ≅ Prediction - Outcome = Fast - Slow = %7.2g</b>", dwt.Float1D(0))
}

td()
@@ -102,18 +102,18 @@ fig1.Legend.Add("Fast", fl)
fig1.Legend.Add("Slow", sl)


-diffTx := core.NewText(b)
-diffTx.Styler(func(s *styles.Style) {
+dwtTx := core.NewText(b)
+dwtTx.Styler(func(s *styles.Style) {
s.Min.X.Ch(80) // clean rendering with variable width content
})
-core.Bind(&diffStr, diffTx)
+core.Bind(&dwtStr, dwtTx)

func updt() {
td()
dl.SetData(driver)
fl.SetData(fast)
sl.SetData(slow)
-diffTx.UpdateRender()
+dwtTx.UpdateRender()
pw.NeedsRender()
}

@@ -157,9 +157,124 @@ Some things you can try:

In summary, [[#sim_td]] based on the competition between two simple exponential integration equations ([[#eq_fast-slow]]) demonstrates that a locally computed temporal derivative can drive synaptic changes in a manner consistent with an error signal that emerges over time.

-## When is the temporal derivative computed?
+## Timing of learning

-A critical issue with this temporal derivative framework is that the accurate computation of a prediction error signal must happen at a specific point in time relative to the onset of the actual outcome, which you can see in the above example in terms of the effects of the different time constants. The precise timing of the prediction signals is less critical, because any neural activity that precedes the outcome can be considered a prediction, and the cumulative effects of the learning will cause these prior activity states to become a prediction in any case.
+A critical issue with this temporal derivative framework is that the accurate computation of a prediction error signal must happen at some point _after_ the onset of the actual outcome. If learning happened during the minus (prediction) phase, for example, it would learn _toward_ the prediction state and _away_ from the prior outcome state!

-The [[kinase algorithm]] provides an answer to this key question (TODO: summary here!).
+Furthermore, the timing must allow the fast component sufficient time to deviate from the slow component, but not too much time, because then the difference will start to go away as the slow component catches up. In many Axon simulations, the inputs are presented at regular intervals for the sake of simplicity, and the timing of learning can be driven algorithmically. But how could it actually work in the brain?
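To see why this window is neither immediate nor indefinite, treat the discrete updates of [[#eq_fast-slow]] as continuous exponential integration, and assume (as a simplification) that both integrators have settled to the prediction value $P$ when the driver steps to the outcome value $O$ at $t = 0$. Then

$$
\mathrm{fast}(t) - \mathrm{slow}(t) = (O - P)\left(e^{-t/\tau_s} - e^{-t/\tau_f}\right),
$$

which is zero at $t = 0$, reaches its maximum at $t^* = \frac{\tau_f \tau_s}{\tau_s - \tau_f} \ln \frac{\tau_s}{\tau_f}$ (about 14 cycles for $\tau_f = 10$, $\tau_s = 20$), and then decays back toward zero as the slow integrator catches up, so learning should be triggered somewhere in the vicinity of $t^*$.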
+
+There is a reliable timing signal available at each individual neuron, which could potentially drive the biological synaptic plasticity process, as illustrated in the following simulation. This signal is based on the way that the absolute value of the difference between `fast` and `slow` evolves over time, plotted as `diff`:
+
+{id="sim_diff" title="Timing for learning" collapsed="true"}
+```Goal
+fastTau := 10.0 // time constant for fast integration
+slowTau := 20.0 // time constant for slow integration
+pred := 50.0
+out := 80.0
+var dwtStr, fastStr, slowStr, predStr, outStr string
+
+##
+totalTime := 100
+driver := zeros(totalTime) // driver is what is driving the system
+fast := zeros(totalTime) // fast is a fast integrator of driver
+slow := zeros(totalTime) // slow is a slow integrator of driver
+diff := zeros(totalTime) // diff is the absolute value of fast - slow
+##
+
+func td() {
+fastStr = fmt.Sprintf("Fast Tau: %g", fastTau)
+slowStr = fmt.Sprintf("Slow Tau: %g", slowTau)
+predStr = fmt.Sprintf("Prediction: %g", pred)
+outStr = fmt.Sprintf("Outcome: %g", out)
+##
+d := array(pred) // current drive
+f := 0.0 // current fast
+s := 0.0 // current slow
+fTau := array(fastTau)
+sTau := array(slowTau)
+##
+for t := range 100 {
+if t == 75 {
+# d = array(out)
+}
+##
+f += (1.0 / fTau) * (d - f) // f moves toward d
+s += (1.0 / sTau) * (d - s) // s moves toward d, more slowly
+driver[t] = d
+fast[t] = f
+slow[t] = s
+diff[t] = abs(s-f)
+##
+}
+##
+dwt := fast[-1] - slow[-1]
+##
+dwtStr = fmt.Sprintf("<b>Weight Change ΔW ≅ Prediction - Outcome = Fast - Slow = %7.2g</b>", dwt.Float1D(0))
+}
+
+td()
+
+plotStyler := func(s *plot.Style) {
+s.Range.SetMax(100).SetMin(0)
+s.Plot.XAxis.Label = "Time"
+s.Plot.XAxis.Range.SetMax(100).SetMin(0)
+s.Plot.Legend.Position.Left = true
+}
+plot.SetStyler(driver, plotStyler)
+
+fig1, pw := lab.NewPlotWidget(b)
+dl := plots.NewLine(fig1, driver)
+fl := plots.NewLine(fig1, fast)
+sl := plots.NewLine(fig1, slow)
+dfl := plots.NewLine(fig1, diff)
+fig1.Legend.Add("Driver", dl)
+fig1.Legend.Add("Fast", fl)
+fig1.Legend.Add("Slow", sl)
+fig1.Legend.Add("Diff", dfl)
+
+
+dwtTx := core.NewText(b)
+dwtTx.Styler(func(s *styles.Style) {
+s.Min.X.Ch(80) // clean rendering with variable width content
+})
+core.Bind(&dwtStr, dwtTx)
+
+func updt() {
+td()
+dl.SetData(driver)
+fl.SetData(fast)
+sl.SetData(slow)
+dfl.SetData(diff)
+dwtTx.UpdateRender()
+pw.NeedsRender()
+}
+
+func addSlider(label *string, val *float64, mxVal float32) {
+tx := core.NewText(b)
+tx.Styler(func(s *styles.Style) {
+s.Min.X.Ch(40) // clean rendering with variable width content
+})
+core.Bind(label, tx)
+sld := core.NewSlider(b).SetMin(1).SetMax(mxVal).SetStep(1).SetEnforceStep(true)
+sld.SendChangeOnInput()
+sld.OnChange(func(e events.Event) {
+updt()
+tx.UpdateRender()
+})
+core.Bind(val, sld)
+}
+
+addSlider(&predStr, &pred, 100)
+addSlider(&outStr, &out, 100)
+addSlider(&fastStr, &fastTau, 50)
+addSlider(&slowStr, &slowTau, 50)
+```
+
+You can see that across different combinations of prediction and outcome driver states, the `diff` value exhibits two distinct peaks: one at the start, when the onset of prediction-phase activity drives `fast` and `slow` to change at their different rates, and another just after the onset of the outcome (plus) phase. Therefore, if we trigger learning to occur some number of cycles (milliseconds) after the onset of the second peak, it should generally happen around the end of the plus phase.
+
+Because the duration of the minus and plus phases is not reliable in principle, both peaks need to be detected. The first, generally larger one can be thought of as a "priming" pulse that provides initial activation to the learning process, while the second one triggers the final adaptation process that is sensitive to the difference between the `fast` and `slow` components.
+
+The one case where there isn't a second peak is when the outcome matches the prediction, in which case no learning will occur anyway. It is possible to add a timeout for learning after the first peak: if no second peak occurs within some amount of time, then everything resets and the process starts over.
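As an illustration of this priming-and-timeout logic (separate from the Goal simulation above; the threshold, timeout, and synthetic `diff` trace below are arbitrary values chosen for the sketch), the following Go snippet detects the two peak onsets in a `diff` trace and schedules learning a fixed delay after the second one, resetting if no second peak arrives in time:

```Go
// Sketch: detect the two peak onsets in a |fast - slow| trace, prime on the
// first, learn a fixed delay after the second, and reset on a timeout.
package main

import "fmt"

func main() {
	diff := make([]float64, 100) // synthetic stand-in for the |fast - slow| trace
	for t := range diff {
		switch {
		case t < 10: // rising side of the first (priming) peak, at prediction onset
			diff[t] = 0.05 * float64(t)
		case t < 30: // decay as the slow integrator catches up
			diff[t] = 0.45 - 0.02*float64(t-10)
		case t >= 85: // decay of the second peak
			diff[t] = 0.54 - 0.02*float64(t-85)
		case t >= 75: // rising side of the second peak, just after outcome onset
			diff[t] = 0.06 * float64(t-75)
		}
	}

	const thresh = 0.1    // peak-onset threshold
	const timeout = 80    // cycles to wait for a second peak before resetting
	const learnDelay = 10 // cycles after second-peak onset at which learning occurs
	primedAt := -1        // cycle of the first (priming) peak onset; -1 = not primed
	for t := 1; t < len(diff); t++ {
		onset := diff[t] >= thresh && diff[t-1] < thresh // upward threshold crossing
		switch {
		case onset && primedAt < 0:
			primedAt = t // first peak: prime the learning process
		case onset && primedAt >= 0:
			fmt.Printf("second peak onset at cycle %d; learn at cycle %d\n", t, t+learnDelay)
			primedAt = -1 // reset for the next trial
		case primedAt >= 0 && t-primedAt > timeout:
			primedAt = -1 // no second peak in time: reset and start over
		}
	}
}
```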
+
+In a spiking network (e.g., the [[kinase algorithm]] for Axon), the time-integrated values that drive learning are not nearly as smooth as those in [[#sim_diff]], because they have a significant contribution from postsynaptic spiking. However, much smoother values are available in the total excitatory and inhibitory conductances coming into each neuron, which sample from a large number of other neurons. The same peak-driven logic works well in this case, and is used in the [[kinase algorithm]].
