This text was taken from
P.A.Wozniak, Economics of learning, Doctoral
Dissertation, University of Economics, Wroclaw, 1995
and adapted for publishing as an independent article on the Web (P.A.Wozniak, Aug 21, 1998).
For a similar study on the relationship between the IQ and the learning process see: Investigating the correlation between the intelligence and the performance in repetitive learning tasks
Working together with Dr. Gorzelańczyk from the Medical Academy of Poznań, I subjected a number of high school students to a long-term learning process using a uniform database, uniform working conditions, equal duration of the learning process, and Algorithm SM-6 for spaced repetition, which is based on approximating forgetting curves and makes use of the concept of the forgetting index. The experiment is now in its third year [this text was written in 1994, the experiment continues in 1998], and data from over 30 students have been collected, providing a unique opportunity to compare all learning parameters taken from the students' computer record files.
A similar, though much less uniform and smaller-scale, experiment was conducted four years ago with the use of Algorithm SM-5, which is based on the direct modification of optimal factor matrices (Wozniak et al. 1994). The cumulative data from this earlier experiment were used to determine the initial values of the entries in the optimal factor matrix used in Algorithm SM-6.
Subjects. The subjects were 32 volunteers, high school students, aged 18-20.
Material. A 2500-item list of questions and answers covering the biology material required at entrance examinations to the Medical Academy in Poland (in Polish). For example:
Procedure. All subjects used Algorithm SM-6 for spacing repetitions, as implemented in SuperMemo for Windows. The subjects worked 2-3 times a week, for 20 to 50 minutes in a single session. The entire list of questions and answers was memorized within 2-3 months in a self-paced manner. Repetitions of the memorized knowledge then continued for 6-7 months. All parameters of the learning process, including the parameters of the algorithmic procedure, were collected in computer files. Statistical analysis tools were used to interpret the data as specified in each particular presented case.
Remarks. Not all subjects fully complied with the requirements set before the experiment with respect to: (1) the number of memorized items, (2) the regularity of learning sessions, and (3) the length of the post-memorization period. Consequently, only 20 data records were selected for the final analysis.
Before I proceed to the statistical comparison of the subjects' learning processes, I would like to present some interesting observations made in reference to the possible causes of the differences between particular students. As Dr. Gorzelańczyk was personally involved in supervising the learning process as well as in tutoring the subjects on relevant topics, I asked him to grade each of the subjects with respect to general intelligence and attitude towards learning. Naturally, such grading is always greatly biased by the subjective judgment of the supervisor; nevertheless, I considered it an important source of possible conclusions. The grading of intelligence did not show a significant correlation with any of the parameters of the learning process measured in the experiment [see: Correlation between the intelligence and the retention in learning based on repetition spacing (Gorzelanczyk et al., 1998)]. However, I was able to use the results of the entrance examinations of all the subjects as a general benchmark of overall performance (all the subjects were candidates for admission to the Medical Academy of Poznań). Very interesting and surprising conclusions could be drawn from the correlation analysis between the learning parameters and this performance benchmark.
Common-sense reasoning holds that good students learn faster than bad students, a fact that should be reflected in the parameters of the learning process. A natural intuition is that good students should exhibit a low forgetting index, quick response times, high grades, etc.
My observation is, however, that in learning based on self-assessment, the opposite correlation appears to be true. Successful students apparently learned slower and appeared to forget items much more frequently than the unsuccessful students!
The interpretation of this paradoxical finding is that good students are far more critical in judging their own progress. It has long been postulated in my earlier publications that there is very little difference between individuals as far as the mechanisms of memory are concerned. It is the way humans process information that sets them apart from each other. Consequently, little difference could be observed among the students in the ability to remember. However, those who appeared to be self-indulgent and lenient in self-assessment usually showed much lower levels of knowledge retention in absolute terms (i.e. as judged by the supervisor).
The following general parameters of the learning process have been collected from the subjects in the course of the experiment (cf. Table 1 Comparison of learning parameters in a group of 20 subjects):
File | AGNIE | ANI | EWA | IZA | KARF | MARCI | MARYS | OLA2 | OLA1 | AGA |
Day | 232 | 178 | 186 | 198 | 244 | 241 | 242 | 192 | 256 | 240 |
Total | 2449 | 2449 | 2449 | 2449 | 2449 | 2449 | 2449 | 2449 | 2449 | 2859 |
Memoriz | 2449 | 1743 | 1954 | 1954 | 1104 | 2449 | 2449 | 1996 | 2449 | 2859 |
Intact | - | 706 | 495 | 495 | 1345 | - | - | 453 | - | - |
Outstand | 2449 | 1743 | 1954 | 1954 | 1104 | 2449 | 2449 | 1996 | 2449 | 588 |
Burden | 36.09 | 40.33 | 20.01 | 30.58 | 10.64 | 45.91 | 102.43 | 30.49 | 60.81 | 9.04 |
Time | 3.58 | 6.71 | 4.16 | 4.20 | 4.65 | 5.74 | 3.22 | 8.73 | 5.86 | 3.84 |
Workld | 2.15 | 4.51 | 1.39 | 2.14 | 0.82 | 4.39 | 5.49 | 4.44 | 5.94 | 0.58 |
Interval | 196 | 119 | 125 | 123 | 139 | 175 | 116 | 115 | 156 | 351 |
Factor | 2.618 | 2.509 | 2.574 | 2.558 | 2.556 | 2.606 | 2.517 | 2.604 | 2.627 | 2.689 |
Rep | 2.58 | 2.46 | 2.29 | 2.87 | 3.29 | 2.74 | 3.30 | 2.97 | 3.10 | 3.11 |
Day/Rep | 89.9 | 72.3 | 81.2 | 68.9 | 74.1 | 88.1 | 73.3 | 64.6 | 82.4 | 77.2 |
Lapses | 0.08 | 0.35 | 0.14 | 0.32 | 0.42 | 0.12 | 0.65 | 0.26 | 0.24 | 0.01 |
FI req | 10 | 10 | 10 | 10 | 10 | 10 | 10 | 10 | 10 | 10 |
FI det | 5.03 | 18.28 | 9.67 | 13.85 | 14.61 | 6.26 | 19.44 | 10.89 | 9.62 | 0.31 |
FI cas | 4231 | 3370 | 2906 | 4534 | 3169 | 4582 | 8209 | 4676 | 6061 | 5845 |
Grade | 4.904 | 4.122 | 4.844 | 4.393 | 4.907 | 4.777 | 4.798 | 4.691 | 4.904 | 4.979 |
Last | 11.94 | 05.94 | 06.94 | 07.94 | 08.94 | 09.94 | 08.94 | 06.94 | 10.94 | 03.96 |
1.3s | - | 4 | - | 10 | - | - | 17 | 4 | 1 | - |
Dif 1 | 1.21 | 0.30 | 1.01 | 0.43 | 0.03 | 0.64 | 0.17 | 0.46 | 0.54 | 1.44 |
Dif 2 | 0.25 | 0.04 | 0.21 | 0.28 | -0.01 | 0.24 | 0.12 | 0.15 | 0.18 | 0.31 |
Dif 3 | 0.13 | 0.10 | 0.12 | 0.04 | 0.10 | 0.11 | -0.00 | 0.08 | 0.09 | 0.13 |
OF 1.3 1 | 5.46 | 4.07 | 7.56 | 2.76 | 9.70 | 5.43 | 2.15 | 1.72 | 1.46 | 2.75 |
OF 1.3 2 | 1.86 | 2.32 | 2.09 | 1.33 | 3.06 | 2.01 | 1.24 | 2.69 | 2.25 | 1.26 |
OF 1.3 3 | 1.39 | 1.40 | 1.39 | 1.53 | 1.65 | 1.53 | 1.88 | 1.59 | 1.82 | 1.33 |
OF 2.5 1 | 19.97 | 7.65 | 19.64 | 7.93 | 10.00 | 13.14 | 4.25 | 7.26 | 7.90 | 20.00 |
OF 2.5 2 | 4.90 | 2.75 | 4.67 | 4.65 | 2.91 | 4.94 | 2.72 | 4.45 | 4.36 | 5.00 |
OF 2.5 3 | 2.92 | 2.62 | 2.81 | 2.07 | 2.80 | 2.90 | 1.85 | 2.50 | 2.94 | 2.89 |
File | KAR | KASI | KASK | MAG | MIKO | MONI | OLA3 | SEBA | TOMA | NATA | Aver | Total |
Day | 184 | 246 | 235 | 237 | 245 | 248 | 232 | 264 | 250 | 266 | 232 | - |
Total | 2866 | 2859 | 2859 | 2859 | 2859 | 2859 | 2859 | 2788 | 2859 | 2859 | 2687 | 53426 |
Memoriz | 2866 | 2859 | 2859 | 2859 | 2859 | 2859 | 2859 | 2788 | 2859 | 2859 | 2526 | 49932 |
Outstand | 725 | 567 | 918 | 1154 | 1123 | 754 | 838 | 641 | 459 | 916 | 1314 | 27230 |
Burden | 10.45 | 9.70 | 11.14 | 11.76 | 13.06 | 10.43 | 11.11 | 9.18 | 9.68 | 12.55 | 23.60 | 495.37 |
Time | 0.52 | 4.33 | 5.01 | 3.45 | 2.04 | 2.19 | 5.02 | 1.14 | 4.70 | 3.38 | 4.05 | - |
Workld | 0.09 | 0.70 | 0.93 | 0.68 | 0.44 | 0.38 | 0.93 | 0.17 | 0.76 | 0.71 | 1.77 | 37.63 |
Interval | 301 | 324 | 292 | 295 | 249 | 313 | 323 | 332 | 324 | 272 | 238 | - |
Factor | 2.706 | 2.598 | 2.667 | 2.464 | 2.618 | 2.680 | 2.673 | 2.684 | 2.679 | 2.608 | 2.618 | - |
Rep | 3.14 | 3.10 | 3.07 | 3.36 | 3.10 | 3.02 | 3.01 | 3.03 | 3.01 | 2.95 | 2.98 | - |
Day/Rep | 58.5 | 79.4 | 76.5 | 70.6 | 78.9 | 82.1 | 77.1 | 87.2 | 83.1 | 90.3 | 77.8 | - |
Lapses | 0.01 | 0.01 | 0.05 | 0.18 | 0.16 | 0.05 | 0.05 | 0.03 | 0.04 | 0.16 | 0.16 | - |
FI req | 10 | 10 | 10 | 10 | 10 | 10 | 10 | 10 | 10 | 10 | 10 | - |
FI det | 0.37 | 0.43 | 2.36 | 7.03 | 6.84 | 2.55 | 2.48 | 1.50 | 1.81 | 7.48 | 6.71 | - |
FI cas | 6177 | 6029 | 6113 | 7458 | 6549 | 5992 | 5937 | 5745 | 5869 | 6081 | 5534 | 109 K |
Grade | 4.991 | 4.746 | 4.977 | 4.953 | 4.999 | 4.908 | 4.623 | 4.998 | 4.998 | 4.910 | 4.829 | - |
Last | 11.95 | 03.96 | 02.96 | 02.96 | 01.96 | 02.96 | 02.96 | 04.96 | 02.96 | 02.96 | - | - |
1.3s | - | - | - | 19 | - | - | - | - | - | - | 3 | 55 |
Dif 1 | 1.42 | 1.43 | 1.18 | 1.18 | 0.47 | 1.29 | 1.44 | 1.11 | 1.32 | 0.95 | 0.93 | - |
Dif 2 | 0.31 | 0.31 | 0.26 | 0.25 | 0.24 | 0.28 | 0.26 | 0.26 | 0.27 | 0.28 | 0.23 | - |
Dif 3 | 0.12 | 0.14 | 0.09 | 0.01 | 0.12 | 0.13 | 0.12 | 0.12 | 0.12 | 0.11 | 0.10 | - |
Table 1 Comparison of learning parameters in a group of 20 subjects
The following interesting facts have emerged from the cross-comparison of the above figures (note that standard arithmetic averages are used in the following paragraph as opposed to weighted arithmetic average based on the number of items per database in Table 1 Comparison of learning parameters in a group of 20 subjects):
Figure 7 Scattergram illustrating the correlation between the forgetting index and the average interval
Figure 8 Scattergram illustrating the correlation between the average number of memory lapses per item and the forgetting index
Figure 9 Scattergram illustrating the relationship between the forgetting index and the response time
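The relationship shown in Figure 7 can be checked against the figures in Table 1. The sketch below computes a plain Pearson correlation coefficient between the detected forgetting index and the average interval for the 20 subjects. It assumes that the "FI det" and "Interval" rows of Table 1 are the quantities plotted in the scattergram; the function name `pearson` is introduced here for illustration only, and the original analysis may have used a different statistic.

```python
import math

# Rows copied from Table 1 (both halves, 20 subjects, same column order).
FI_DET = [5.03, 18.28, 9.67, 13.85, 14.61, 6.26, 19.44, 10.89, 9.62, 0.31,
          0.37, 0.43, 2.36, 7.03, 6.84, 2.55, 2.48, 1.50, 1.81, 7.48]
INTERVAL = [196, 119, 125, 123, 139, 175, 116, 115, 156, 351,
            301, 324, 292, 295, 249, 313, 323, 332, 324, 272]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Unweighted: every subject counts equally, regardless of list size.
r = pearson(FI_DET, INTERVAL)
print(f"r(FI det, Interval) = {r:.2f}")
```

The coefficient comes out negative: subjects with a higher detected forgetting index ended up with shorter average intervals.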
Because of my long-standing interest in the approximation of the forgetting curve and in the nature of forgetting itself, I expected to collect valuable evidence for the exponential nature of forgetting by compiling a cumulative forgetting curve for E-factor equal to 2.5 and repetition number equal to one. Data from the students' file records were superimposed to plot the average forgetting curve for items that enter the learning process. As can be seen, the very high retention at repetitions rendered the collected evidence far from conclusive, despite the very large number of repetition cases gathered (over 51,000 repetitions in total).
Figure 10 Cumulative forgetting curve for 20 students, E-factor 2.5, and repetition 1 (over 51 thousand repetitions collected)
In the presented figure, RF stands for R-factor, OF for O-factor, Cases for the number of repetitions studied, d for the forgetting decay constant from the equation retention = exp(-d*U-factor), and Dev for the mean square deviation of the experimental data from the retention curve approximated with the decay constant d.
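To make the role of d and Dev concrete, here is a minimal sketch of how a decay constant could be estimated from (U-factor, retention) pairs under the model retention = exp(-d*U-factor). It assumes a least-squares fit of ln(retention) against U-factor through the origin, which is only one possible approach and not necessarily the one used to produce the figure. The function names and the sample points are hypothetical; the points are generated from the model itself and are not experimental data.

```python
import math


def fit_decay_constant(u_factors, retentions):
    """Estimate d in retention = exp(-d * U) by least squares on ln(retention).

    Retentions must be fractions in (0, 1]; a zero retention has no logarithm.
    """
    num = sum(u * math.log(r) for u, r in zip(u_factors, retentions))
    den = sum(u * u for u in u_factors)
    return -num / den


def mean_square_deviation(u_factors, retentions, d):
    """Mean squared difference between observed retention and exp(-d * U)."""
    errors = [(r - math.exp(-d * u)) ** 2 for u, r in zip(u_factors, retentions)]
    return sum(errors) / len(errors)


# Made-up, noise-free points generated with d = 0.05 (assumption for the demo).
u = [1, 2, 4, 8, 16]
r = [math.exp(-0.05 * x) for x in u]

d = fit_decay_constant(u, r)
dev = mean_square_deviation(u, r, d)
print(d, dev)
```

With real repetition data, retention close to 100% at almost every point leaves ln(retention) near zero, which is one way to see why the evidence for the exponential shape remained inconclusive.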
A disappointing shortcoming of the adopted approach was the very high standard deviation of the detected forgetting index, as reported earlier. As illustrated in the next figure, the superposition of forgetting curves for different values of the forgetting index results in a U-shaped curve that has little relevance to the true nature of forgetting (see Figure 11 Distorted forgetting curve resulting from differences in the forgetting index).
Figure 11 Distorted forgetting curve resulting from differences in the forgetting index
The U-shaped forgetting curve results from the fact that subjects with different values of the forgetting index repeat items at different intervals, while the algorithm always strives to make them forget no more and no less than the desired proportion specified by the forgetting index. As a result, all students with intervals shorter than the maximum U-factor will tend to contribute to the forgetting curve around the point specified by the optimal interval, and their average retention, expressed in percent, will oscillate around 100 minus the forgetting index. Only the students whose intervals approach the maximum U-factor will show higher retention. Similarly, the highest retention will be registered for the shortest intervals; hence the U-shaped curve.
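The effect described above can be illustrated under the exponential model retention = exp(-d*U-factor) quoted with Figure 10. In this sketch, the decay constants are hypothetical stand-ins for students who forget at different rates, and `optimal_u_factor` is a name introduced here. The sketch is not a reproduction of Algorithm SM-6; it only shows that if each student is scheduled at the point where retention equals 100 minus the forgetting index, the pooled measurements land at different U-factors but at the same retention.

```python
import math


def optimal_u_factor(d, forgetting_index):
    """U-factor at which exp(-d * U) falls to 1 - forgetting_index / 100."""
    return -math.log(1 - forgetting_index / 100) / d


REQUESTED_FI = 10  # requested forgetting index used in the experiment (Table 1)

# Hypothetical decay constants for three students (assumption for the demo).
for d in (0.02, 0.05, 0.10):
    u = optimal_u_factor(d, REQUESTED_FI)
    retention = 100 * math.exp(-d * u)
    print(f"d={d:.2f}  U={u:.2f}  retention at repetition={retention:.1f}%")
```

Because every student is sampled near the same retention level, the superimposed points say little about how retention changes with the interval for any single student.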
A 3-D representation of the cumulative matrix of retention factors is presented below (Figure 12 Cumulative matrix of retention factors). The matrix was obtained by superimposing forgetting curves corresponding with all R-factors taken from particular subjects.
Figure 12 Cumulative matrix of retention factors
In the figure presented above, the XYZ axes correspond respectively to the value of E-factor (from 1.3 to 3.2), the repetition number (from 1 to 20), and the value of R-factor expressed as a percentage of its maximum value. Note that for the sake of graph clarity, R-factors corresponding to repetition numbers greater than 2 were multiplied by 0.66 to expose the more distant and more accurately estimated areas. The plain flat and plain down-sloping areas are those for which no repetition data were available; hence they refer only to the model of the average student (Wozniak et al. 1994). As opposed to the matrix of optimal factors, the figure illustrates a sharp contrast between the values of R-factors, and consequently the lengths of inter-repetition intervals, across the range of E-factors. This contrast is marked, however, only for low repetition numbers. Because the data collection period was limited to about 12 months, very few repetitions were recorded in the area above the third repetition; hence the much less visible differentiation of R-factors for different E-factor categories.
As a graphic presentation of the cross-section of retention factor matrices would require four-dimensional figures, below I present such a cross-section flattened at the repetition number dimension. Thus, only the entries corresponding to the repetition number equal to one are presented.
Figure 13 Cross-comparison of R-factors for the first repetition and varying E-factor among the subjects sorted for the forgetting index
In the figure presented above, the XYZ axes correspond respectively to the value of E-factor (from 1.3 to 3.2), subject number (subjects were sorted for forgetting index; lower values placed distally), and to the value of R-factor expressed as percent of its maximum value, which is 20 in the case of first repetition. The plain flat area corresponds to no repetition data available.
The down-sloping ridge corresponding to E-factor equal to 2.5 illustrates the influence of the forgetting index detected during repetitions on the value of R-factors. The two peaks located at E-factor=1.3 and E-factor=1.8 illustrate the saltatorial flow of items down the E-factor axis as a result of forgetting. The peaks result from the high retention detected at repetitions of the forgotten items. The valleys in between do not indicate an inherently irregular nature of the matrix of R-factors, but only show the areas where a low number of repetition cases prevented establishing accurate values of the matrix entries. The three-peak nature of the first row of the matrix of retention factors, corresponding to repetition number equal to one, disappears with the progression of the forgetting index toward higher values. Though the above observation might suggest adopting a sparser matrix of R-factors with fewer E-factor columns, the situation presented in the figure is not necessarily typical. The location of the peaks, or even their appearance, will greatly depend on the students' grading habits, which influence the rate of change of E-factor values.
As in the case of cross-section of retention factor matrices, a cross-section of optimal factor matrices flattened at the repetition number dimension is presented below (Figure 14 Comparison of O-factors for repetition number equal to one). Only the entries corresponding to the repetition number equal to one are presented.
Figure 14 Comparison of O-factors for repetition number equal to one
In the figure presented above, the XYZ axes correspond respectively to the value of E-factor (from 1.3 to 3.2), subject number (subjects were sorted for forgetting index; lower values placed distally), and to the value of O-factor expressed as percent of its maximum value, which is 20 in the case of first repetition.
As the matrix of optimal factors is derived directly from the matrix of retention factors, a natural correspondence can be seen between the shape of the cross-analysis graph for O-factors at repetition number equal to one and the same graph for R-factors (cf. Figure 13 Cross-comparison of R-factors for the first repetition and varying E-factor among the subjects sorted for the forgetting index). The steady decrease of O-factors between the ridge at E-factor=2.5 and the higher E-factor areas, in marked contrast to the same region in the corresponding R-factors graph, results from the application of on-line smoothing of the matrix of optimal factors in the process of learning. Analogously, the two peaks discussed in the R-factors comparison have blended with the surrounding area, providing for more regular spacing of repetitions across the E-factor matrix.
Yet more conclusive is the same graph plotted upon weighted Gaussian smoothing of the 3-dimensional matrix of optimal factors, i.e. the matrix built from optimal factor matrices extended by the student dimension. The weight used in Gaussian smoothing was the number of repetition cases recorded.
Figure 15 First layer of the 3-D matrix of optimal factors upon weighted Gaussian smoothing based on the number of repetition cases
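The idea of weighting the smoothing by the number of repetition cases can be sketched in one dimension. The kernel width `sigma`, the function name, and the sample values below are assumptions made for illustration; the smoothing applied to the 3-dimensional matrix in this study may differ in kernel shape and in how the dimensions are combined. Each smoothed entry is a Gaussian-weighted average of its neighbours, with every neighbour additionally weighted by its number of repetition cases, so that poorly supported entries have little influence.

```python
import math


def weighted_gaussian_smooth(values, weights, sigma=1.0):
    """Smooth `values` with a Gaussian kernel, scaling each point by its weight."""
    n = len(values)
    smoothed = []
    for i in range(n):
        num = 0.0
        den = 0.0
        for j in range(n):
            kernel = math.exp(-((i - j) ** 2) / (2 * sigma ** 2))
            num += kernel * weights[j] * values[j]
            den += kernel * weights[j]
        # If no neighbour carries any weight, keep the original entry.
        smoothed.append(num / den if den > 0 else values[i])
    return smoothed


# Hypothetical row of O-factors with repetition-case counts as weights.
o_factors = [2.0, 9.0, 2.2, 2.4, 2.6]
cases = [500, 1, 450, 600, 550]
print([round(v, 2) for v in weighted_gaussian_smooth(o_factors, cases)])
```

In this made-up row the outlying second entry is backed by a single case, so it is pulled strongly toward its well-supported neighbours and barely affects them.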
In the graph presented above, which is a smoothed equivalent of the one presented earlier (see Figure 14 Comparison of O-factors for repetition number equal to one), it can be more clearly seen that three elements determine the value of the matrix of optimal factors for repetition number equal to one:
For a low forgetting index, the particularly large difference between O-factors for E-factors equal to 2.5 and for E-factors less than two results not only from inherently longer inter-repetition intervals for easier items, but also from the slower convergence of O-factors to their optimal values in low E-factor areas, due to the reduced number of repetition cases that drive the optimization.
Comparison of the distribution of intervals in particular subject file records shows that, for natural reasons, students with low forgetting index show less differentiation among item intervals, and that the average interval is greater. For example, the least successful students, with the lowest value of the detected forgetting index showed the greatest number of items in the 256-512 days slot. On the other end of the spectrum, the mode of distribution for the highest forgetting index coincided with the 64-128 days interval range (see Figure 16 Comparison of inter-repetition interval distribution among students sorted for forgetting index).
Figure 16 Comparison of inter-repetition interval distribution among students sorted for forgetting index
In the above graph, the XYZ axes correspond respectively to the interval category (note that for the sake of graph clarity, the polarity of the axis was reversed), the subject number (subjects were sorted for forgetting index; lower values placed distally), and the number of items falling into the particular interval category (the Z axis has not been calibrated because of its dependence on the size of the question-answer list).
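The interval categories quoted above (64-128 days, 256-512 days) suggest slots whose bounds double from one category to the next. The sketch below bins intervals into such slots and reports the mode of the distribution. It assumes whole-day intervals and half-open slots (lower bound included, upper bound excluded), which the text does not specify; the function names and the sample intervals are hypothetical.

```python
from collections import Counter


def interval_slot(interval_days):
    """Return the doubling slot (low, high) with low <= interval_days < high."""
    if interval_days < 1:
        raise ValueError("interval must be at least one day")
    low = 1 << (int(interval_days).bit_length() - 1)  # largest power of 2 <= interval
    return (low, 2 * low)


def interval_distribution(intervals):
    """Count how many items fall into each doubling slot."""
    return Counter(interval_slot(i) for i in intervals)


# Made-up item intervals in days for one student (not experimental data).
intervals = [70, 90, 100, 130, 300, 65]
distribution = interval_distribution(intervals)
mode_slot = distribution.most_common(1)[0][0]
print(sorted(distribution.items()), mode_slot)
```

For these made-up values, four of the six items fall into the 64-128 days slot, which is therefore the mode.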
As in the case of interval distribution, students with high forgetting index showed an increased differentiation of E-factors, though the mode of the distribution did not indicate greater difficulty of items among the students with higher forgetting rates. [The concept of E-factor in Algorithm SM-6 corresponds roughly to A-factors in Algorithm SM-11]
Figure 17 Cumulative distribution of E-factors among students sorted for forgetting index
In the presented graph, the XYZ axes correspond respectively to the E-factor category (note again that for the sake of graph clarity, the polarity of the axis was reversed), the subject number (subjects were sorted for the forgetting index; lower values placed distally), and the number of items falling into the particular E-factor category (the Z axis has not been calibrated because of its dependence on the size of the considered databases).
The most striking observation coming from the comparison of E-factor distributions is that the tested list of questions and answers appeared to be surprisingly easy for all subjects. As a consequence, the graph shows a uniform ridge along the E-factor category of 2.6-2.7, and there is no perceptible bulging around the 1.3 category, which in most cases acts like a scavenger of bad items, and can be used in implementing programmatic filters that make it possible to eliminate ill-structured items from lists of questions and answers.
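A programmatic filter of the kind mentioned above could be as simple as the following sketch. The item representation (dictionaries with `question` and `e_factor` keys) and the function name are hypothetical and do not reflect any actual file format; the threshold of 1.3 is the lowest E-factor category named in the text. Flagged items would still need human review, since a low E-factor indicates difficulty and does not by itself prove that an item is ill-structured.

```python
MIN_E_FACTOR = 1.3  # lowest E-factor category, where bad items tend to collect


def flag_ill_structured(items, threshold=MIN_E_FACTOR):
    """Split items into (kept, flagged); flagged items sit at or below the threshold."""
    kept, flagged = [], []
    for item in items:
        if item["e_factor"] <= threshold:
            flagged.append(item)
        else:
            kept.append(item)
    return kept, flagged


# Made-up items for illustration only.
items = [
    {"question": "Q1", "e_factor": 2.6},
    {"question": "Q2", "e_factor": 1.3},
    {"question": "Q3", "e_factor": 2.7},
]
kept, flagged = flag_ill_structured(items)
print([i["question"] for i in kept], [i["question"] for i in flagged])
```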