SuperMemo: Statistics windows

Statistics window

The statistics windows, Statistics and Element data, can most conveniently be viewed by pressing F5 (or choosing Window : Layout : Classic layout). This arranges the statistics windows in the way first introduced in SuperMemo 3.0 in 1988. The caption of the statistics window displays the name of the collection in square brackets.

Learning parameters displayed in the statistics window:

Date - current date and the day of the week. If this value is preceded by Night, it means that the new day has already started but the old repetition day will still last as long as defined in Midnight clock shift. In the example on the right, the picture snapshot was taken past midnight on the first day of spring 2002 (Thursday). The collection in use is named "all" as shown in the caption
First day - date on which the learning process began (i.e. the day on which the first element was memorized). The example collection presented in the picture has been in use since December 15, 1987 (i.e. the birth date of SuperMemo for DOS)
Day - number of days in the learning process (i.e. number of days between Date and First day).
Day=Date-First day.
The presented collection has been in use for 5211 days (i.e. 14 years and 97 days)
Total - the number of items, topics and tasks in the collection.
Memorized+Pending+Dismissed=Total.
Deleted elements do not contribute to the total count of elements in the system. In the picture, the presented collection is made up of nearly 137,000 elements
Items+Topics - the number of items and the number of topics (and tasks) in the collection.
Items+Topics=Total.
In the example, the collection includes over 82,000 items and over 54,000 topics (or tasks)
Memorized - total number of elements introduced into the learning process with options such as Learn or Remember. If an item takes part in repetitions it is a memorized item. It does not mean it is a remembered item (see FAQ). The presented collection has nearly 116,000 elements in the learning process and these elements make up 84.7% of all elements in the collection, i.e. Memorized/Total=0.847
Memorized items - the number of memorized items in the collection and the proportion of memorized items among memorized elements. In the example, about 73,500 items take part in repetitions. These items make 63.5% of all memorized elements. The retention field (below) indicates that 94.11% of these items should be remembered at any given time
Memorized topics - the number of memorized topics and the proportion of memorized topics among all memorized elements. In the example, over 42,000 topics make up 36.5% of the material taking part in the learning process. In incremental reading, topics easily make up half of the material taking part in repetitions
Memorized/Day - number of elements memorized in the collection per day. In the example, an average of 22.2 elements has been memorized daily in the presented collection over the last 14 years
Pending - the number of elements (topics or items) that have not yet been introduced into the learning process but still await memorization with Learn, Remember, etc. All pending elements are kept in the so-called pending queue that determines the sequence of learning new elements. Dismissed items are not kept in the pending queue. In the example, the collection contains 6491 elements in the pending queue. With incremental reading and selective postpone tools, the role of the pending queue in SuperMemo is diminishing
Dismissed - the number of elements (topics, items or tasks) that have been excluded from the learning process and are kept only as reference material, knowledge tree skeletal material, or tasklist material. Dismissed items are neither pending nor memorized. All tasks are dismissed by default, i.e. they usually do not take part in repetitions. In the example, nearly 15,000 elements have been dismissed. In incremental reading, you will often keep parent articles for context and reference until child items are fully formulated, moved to their target categories and provided with all necessary context. This is why the proportion of dismissed material is often quite high in collections subject to incremental reading
Outstanding - number of outstanding items, outstanding topics and final drill items scheduled for repetition on this given day. The first number (before the plus sign) indicates the number of items scheduled for this given day and not yet processed. The second number (after the plus sign) indicates the number of topics scheduled for review for this day. The third number (after the second plus sign), if present, indicates the number of items that have already been repeated today but scored less than Good (4). Those are the items that make up the final drill queue. The final drill queue is built only if Skip final drill is unchecked in Options. In the presented collection, there are still 64 items scheduled for repetition on March 21, 2002. There are also 350 topics scheduled for review on that day (most likely as part of incremental reading). There are no elements in the final drill queue (the third component of Outstanding is missing)
Retention - estimated knowledge retention in the collection. In the example, nearly 95% of the material should be recalled in a random test on all elements in the collection at any time. You can test your retention using random tests and see if SuperMemo's estimates are accurate. This statistic may be overly optimistic if you abuse rescheduling tools such as Postpone or Mercy
Measured FI - the value of the measured forgetting index as recorded during repetitions. The measured forgetting index is the proportion of items not remembered during repetitions. The number in the parentheses indicates Measured FI for the current session (i.e. since the opening of the collection). It is quite usual to have Measured FI higher than Average FI. This is due to two factors: (1) every user will experience delays in repetitions from time to time (e.g. as a result of using Postpone), (2) SuperMemo imposes some constraints on the length of intervals that, in some cases, make it schedule repetitions later than would be implied by the forgetting index. The constraints in computing intervals, for example, prevent the new interval from being shorter than the old interval (assuming the item has not been forgotten). For low values of the forgetting index and items with a low A-Factor, the new optimum interval might often be shorter than the old one! In the presented example, 11.54% of repetitions end with grades less than Pass (since the measured forgetting index record was last reset). In the current session on Mar 21, 2002, 11.5% of item repetitions ended in failure (i.e. grade less than Pass (3))
Average FI - the average requested forgetting index in the entire collection (the number in parentheses is the default forgetting index). If the forgetting index of individual elements is not changed manually, Average FI is equal to the default forgetting index as set in Tools : Options : Learning : Forgetting index. The default forgetting index is the requested forgetting index given to all categories and, as a result, to all new items added to the collection. Forgetting index, in general, is the proportion of items that are not remembered during repetitions. The lower the value of the forgetting index the better the recall of the element, but the more repetitions will be needed to keep it in memory. The optimum value of the forgetting index falls into the range from 7% to 13%. Too low a forgetting index makes learning too tiresome due to a prohibitively large number of repetitions. All elements can have their desired forgetting index set individually. The easiest way to change the forgetting index of a large number of elements is to use the Forgetting index option on the contents menu in the contents window or in the browser. In the presented example, the average forgetting index is 10.07% while the default forgetting index is 10%. See: Using forgetting index
Burden - estimation of the average number of repetitions per day. This value is equal to the sum of all interval reciprocals (i.e. 1/interval). The interpretation of this number is as follows: every item with interval of 100 days is on average repeated 1/100 times per day. Thus the sum of interval reciprocals is a good indicator of the total repetition workload in the collection. The presented collection requires about 752 repetitions per day. In incremental reading, it is not unusual to have many more topics in the process than one can handle. Postpone can be used to unload the excess of topics at the end of the learning day or session
Burden +/- - the change of the Burden parameter above in the present session. Here, on Mar 21, 2002, the average number of expected repetitions increased by 17.3 elements per day. This large increase could come from a heavy use of incremental reading tools that make it possible to schedule new extracts with the new interval of one day. 17 such extracts would increase Burden by 17*(1/1)=17 items per day
Workload - the average daily time used for responding to questions in a given collection.
Workload = Burden*Avg. time. In the presented collection, 752 repetitions per day taking 8.5 seconds each result in a daily repetition time estimation of over 1 hour and 46 minutes. A real learning session may be twice as long due to grading, editing, reviewing the collection and various interruptions. It may also be much shorter if Postpone is used
Subset - number of elements scheduled for subset repetition (e.g. elements in the random test queue in Tools : Random tests, elements in branch repetitions in Contents : Learn, elements in a browser subset repetitions in browser's Learn, etc.). The display may have a form of <items left>+<topics left>+<pending left>+(<subset type>) in subset tests, or <elements unprocessed>/<all elements in the test> in random tests. Here 64 elements are scheduled for subset review. None of these are pending (i.e. all are memorized). In incremental reading, subset repetitions are most often executed in the contents window or in the browser with Learning : Learn (Ctrl+Alt+L) or with Learning : Review. In the latter case, not only outstanding elements are reviewed. The remaining elements are subject to mid-interval review as well
Rep count - the total count of repetitions made in the collection. In the presented collection, 771 thousand repetitions have been made. This is about 6.7 repetitions per memorized element (this count includes repetitions of items that have been reset, forgotten, dismissed, deleted, etc.)
Time - total question response time in the current session and the total session time (in parentheses). Here the total time needed to respond to questions in the current session was 4 minutes and 30 seconds in a session that has lasted over 1 hour and 46 minutes. Significant differences in response and session time occur when the collection is intensely edited, in incremental reading (passive review does not contribute to response time) or when the user simply takes a break from learning
Avg. time - average response time in seconds (i.e. the time between displaying the question and choosing Show answer or equivalent). In the presented collection, the average time to answer a single question is over 8.5 seconds. If this number grows beyond 10-20 seconds, you may need to check whether your learning material is overly difficult or badly structured
Total time - total time taken by responding to questions in the collection. This time cannot be accurately measured for collections created with SuperMemo 98 or earlier. The exact measurements of this parameter were only made possible in SuperMemo 99. If you upgrade your collections to SuperMemo 99, SuperMemo will assist you in estimating this number. If you upgrade directly to a later version, this number will roughly be guessed for you. SuperMemo will derive this time from the total number of items, average number of repetitions, average number of lapses and the average repetition time. In the presented example, answering questions during repetitions took over 113 days of non-stop learning (in 14 years of the learning process)
Lapses - average number of times individual items have been forgotten in the collection (only memorized elements are averaged). The number in parentheses shows the number of lapses in the current session. Here an average element has been forgotten 0.587 times. In the presented session, three lapses have occurred
Speed - the average knowledge acquisition rate, i.e. the number of items memorized per year per minute of daily work. Initially this value may be as high as 100,000 items/year/minute (esp. if you enthusiastically start working with the program before truly measuring its limitations; or rather the limitations of your memory); however, it should with time stabilize between 40 and 400 items/year/minute. In the presented collection, every minute of work per day resulted in 48 new items memorized each year. As this value is derived from Burden, it may be highly underestimated if you use Postpone a lot (e.g. in incremental reading)
Cost - the cost in time of memorizing a single item, i.e. total learning time divided by the number of memorized items. Cost = Total time / Memorized
In the presented example, the total repetition time per single item is 1.4 minutes. In other words, each item has contributed 1.4 minutes to the total of non-stop 113 days and 19 hours of repetitions. The cost of editing, restructuring, incremental reading, etc. is not included in this number
Daily cost - daily repetition time per each newly memorized item.
Daily cost = Workload / (Memorized/Day)
In the presented collection, each of the 22 newly memorized items per day contributes about 4.8 minutes of repetitions to the total workload of 1 hour 46 minutes per day. As this value is derived from Burden, it may be highly underestimated if you use Postpone a lot (e.g. in incremental reading)
Interval - average interval among memorized elements in the collection. Here an average memorized element has reached the average interval of 860 days
Repetitions - average number of repetitions per memorized element in the collection. Here an average element has been repeated 4.4 times
Last Rep - average date of the last repetition among memorized elements in the collection. Here the average date of the last repetition is October 10, 2000
Next Rep - average date of the next repetition among memorized elements in the collection.
Next Rep = Last Rep + Interval
Here the average date of the next repetition is February 18, 2003 or 860 days after October 10, 2000
Completion - the expected date on which all elements from the pending queue will be memorized assuming the present rate of learning new items.
Completion=Date+(Pending/(Memorized/Day))
In the example, it would take until January 2003 to memorize all 6491 pending items at the speed of 22 items per day (beginning with March 21, 2002)
A-Factor - average value of A-Factor among memorized items in the collection. A-Factor is a measure of item difficulty. The higher the A-Factor, the easier the item. In the presented collection, the average A-Factor is about 4.09. This indicates that the collection is rather well-structured and the material is thus relatively easy to remember
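
The formulas given above for Burden, Workload, Cost, Daily cost and Completion can be checked against the quoted figures with a short Python sketch. This is only an illustration, not SuperMemo's own code: the function names are hypothetical, Memorized is taken as the rounded figure of 116,000, and Completion is rounded to whole days as an assumption.

```python
from datetime import date, timedelta


def burden(intervals_days):
    """Sum of interval reciprocals: expected repetitions per day."""
    return sum(1.0 / interval for interval in intervals_days)


def workload_minutes(burden_per_day, avg_time_seconds):
    """Workload = Burden * Avg. time, converted to minutes per day."""
    return burden_per_day * avg_time_seconds / 60.0


def cost_minutes(total_time_days, memorized):
    """Cost = Total time / Memorized, in minutes per element."""
    return total_time_days * 24 * 60 / memorized


def daily_cost_minutes(workload_min, memorized_per_day):
    """Daily cost = Workload / (Memorized/Day), in minutes."""
    return workload_min / memorized_per_day


def completion(today, pending, memorized_per_day):
    """Completion = Date + Pending / (Memorized/Day), rounded to whole days."""
    return today + timedelta(days=round(pending / memorized_per_day))


# Figures quoted in the example collection (rounded where the text rounds them).
workload = workload_minutes(752, 8.5)                # roughly 106.5 minutes
cost = cost_minutes(113 + 19 / 24, 116_000)          # roughly 1.4 minutes
daily_cost = daily_cost_minutes(workload, 22.2)      # roughly 4.8 minutes
done_by = completion(date(2002, 3, 21), 6491, 22.2)  # falls in January 2003
extra_burden = burden([1] * 17)                      # 17 extracts at interval 1
```

Because Workload and Daily cost both start from Burden, anything that distorts Burden (such as heavy use of Postpone) carries through to them.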

Comments:

Items are added to the final drill not only during standard repetitions when you grade an element below Good (4). Operations such as Remember (Ctrl+M), Remember Cloze, and Add to drill (Shift+Ctrl+D) will also extend the final drill queue. The final drill queue is created only if you uncheck Tools : Options : Learning : Skip final drill
Some fields of the statistics window can be edited. For example: Measured FI, Total time, Rep count, etc. To edit an entry, click it, type the new value and press Enter. If the entry cannot be modified, SuperMemo will warn you (e.g. "Retention entry cannot be modified").
See Survey 1994 and Survey 1999 for some interesting notes about the speed of learning reached with SuperMemo
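
For readers who copy the Outstanding value out of the statistics window, the two-part and three-part forms described for that field (items, topics and the optional final drill count) can be split with a small Python sketch. The exact on-screen layout is assumed here to be plain numbers joined by plus signs, such as 64+350, and the names Outstanding and parse_outstanding are hypothetical.

```python
from typing import NamedTuple


class Outstanding(NamedTuple):
    items: int        # items scheduled for today and not yet processed
    topics: int       # topics scheduled for review today
    final_drill: int  # items in the final drill queue (0 if the part is absent)


def parse_outstanding(text: str) -> Outstanding:
    """Split an Outstanding value such as '64+350' or '64+350+5'."""
    parts = [int(part) for part in text.strip().split("+")]
    if len(parts) not in (2, 3):
        raise ValueError(f"expected 2 or 3 numbers, got {len(parts)}")
    final_drill = parts[2] if len(parts) == 3 else 0
    return Outstanding(parts[0], parts[1], final_drill)


example = parse_outstanding("64+350")
```

A missing third number is reported as 0, which matches the example where the final drill queue is empty.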

FAQ

In SuperMemo, Memorized<>Remembered
(jj, UK, Sunday, December 24, 2000 1:54 AM)
Question:
I have noticed in the Statistics that the number of elements memorized increases even when I enter Fail when answering incorrectly. For instance, in the collection of US States Capitals, it was showing 100% memorized when I was still getting many of them wrong
Answer:
Parameter Memorized indicates the number of elements in the learning process; not the number of elements you are able to recall correctly. If you make regular repetitions in the long run (i.e. over weeks and months), the number of elements you will be able to recall will equal Memorized*Retention
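
As a rough illustration of Memorized*Retention, the following Python sketch uses hypothetical numbers (50 memorized elements and a retention of 95%); neither figure comes from the question above.

```python
def expected_recalled(memorized: int, retention: float) -> float:
    """Expected number of elements recalled at a random moment."""
    return memorized * retention


# Hypothetical collection: every element is memorized, yet not all are recalled.
recalled = expected_recalled(50, 0.95)
```

With these assumed figures, all 50 elements count as memorized while only about 47 or 48 are expected to be recalled at any given time.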

Retention statistic assumes regular repetitions and well-structured learning material
(dansujp, Sun, Sep 16, 2001 3:07 PM)
Question:
When I returned from vacation, I expected the retention to be something like 80% because I have not done any repetitions for two weeks. But it was exactly the same as before I left
Answer:
The Retention statistic is derived directly from the measured forgetting index on the assumption of a negatively exponential forgetting curve. This curve is only representative of well-structured learning material. In addition, the forgetting index measurements are averaged over all recorded cases. A break in repetitions will invalidate the statistic. Resuming repetitions is not a guarantee of accuracy as the large number of earlier repetitions will result in overestimating the retention on a small-sample measurement. The only valid estimation of retention after a break in learning is the one that follows resetting the past forgetting index record (File : Tools : Reset parameters : Forgetting index record). This will result in gathering new data that will approach true retention for the sample tested with accuracy proportional to the number of repetitions done
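
One way to see how a retention figure can follow from the forgetting index is sketched below in Python. It assumes an exponential forgetting curve R(t)=exp(-k*t) that drops to 1-FI at the moment of repetition, and averages R over the whole interval, which gives -FI/ln(1-FI). The formula is not stated on this page, so treat it as an assumption; it does reproduce the 94.11% retention quoted alongside the measured forgetting index of 11.54%. The function name is hypothetical.

```python
import math


def retention_from_fi(forgetting_index: float) -> float:
    """Average retention over an interval, assuming exponential forgetting.

    With R(t) = exp(-k*t) and R(T) = 1 - FI at the repetition time T,
    the mean of R over [0, T] equals -FI / ln(1 - FI).
    """
    if not 0.0 < forgetting_index < 1.0:
        raise ValueError("forgetting index must be between 0 and 1 (exclusive)")
    return -forgetting_index / math.log(1.0 - forgetting_index)


measured = retention_from_fi(0.1154)  # measured FI from the example
default = retention_from_fi(0.10)     # default FI from the example
```

Since the input is an average over all recorded repetitions, a long break does not move the result until new measurements accumulate, which is consistent with the answer above.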