Integrating SuperMemo with the Internet

Dr A. Szepieniec, Dr P. Wozniak, SuperMemo Corp., October 25, 1996

This text was originally an internal document entitled "Where do we want to go?", designed as a theoretical basis for the business plan of SuperMemo Corp. in seeking venture capital.

The progress of mankind is a multifaceted phenomenon that spans the entire spectrum of human activity, across all branches of science and technology, down to the daily routine of a milkman or a housewife. Nothing, however, determines the pace of progress more visibly than mankind’s ability to process information. And nothing but information processing is so strongly subject to positive feedback, which has made it explode this century with a ferocity that makes it truly impossible to predict what boundaries it is going to break just ten years from now.

Forty thousand years ago, humans started communicating orally. It was not until four thousand years BC that they were able to put down their message in writing. Only in the 15th century did Gutenberg’s invention make it possible to disseminate writing widely. The next breakthrough came with the advent of computers this century, particularly in the 1980s, with the explosive growth of desktop computing on the one hand and networking on the other.

The most recent revolution originated in the research labs of the European Laboratory for Particle Physics in Switzerland. Yet in 1993, only the most devoted Internet insiders knew its name: World Wide Web. The creators of WWW developed several simple application-layer protocols and a document-publishing standard. The three key concepts were URLs, HTML and HTTP. These have unleashed a global hunger for more easily available information and for the benefits of publish-as-you-go. The vision of a global hyperspace has finally become reality, and at its best: without borders and without (or nearly without) government control. Bandwidth limitations permitting, there seems to be no end to the exponential growth of the Web and its technological versatility.

It is our strong conviction that the next revolution in information processing will come with cognitive technologies. These are technologies that make use of the newest findings in the field of psychophysiology that affect information processing on the part of a human subject. SMC has pioneered a number of such technologies, most prominently repetition spacing algorithms, commercially known worldwide as SuperMemo. In short, cognitive technologies make documents 'understand' the reader by taking into account the imperfection of his or her memory and cognition. This approach is made possible by keeping track of the user’s navigation in the knowledge space and by creating mathematical models of his or her memory.

SMC’s mission is to provide humans with the most efficient interface to the world of information through the application of all known cognitive technologies.

Cognitive technologies optimize information processing and learning in the following four pivotal areas (in parentheses: concepts developed at SMC):

SMC is currently working on Project SM-XXI (now at the design stage), a collection of software components that will make the following vision a reality in the XXI century:

  1. the interface to external sources of information will cover both electronic and non-electronic sources (the latter case will require tools for easy incorporation of information coming from external sources within the knowledge system paradigm)
  2. the electronic sources will be both general and dedicated. SMC will work on promoting standards that will allow publishers of information to comply with cognitive technologies, so as to increase the proportion of dedicated sources over time
  3. for information sources: all platforms should be covered one way or another: desktop operating systems (CD-ROMs), Internet, handheld devices, dedicated databases and knowledge systems developed for SuperMemo 7, SuperMemo 8, its successors, and many more
  4. for information publishing: all platforms should be covered as well: stand-alone desktop applications, CD-ROM publishing, Internet, client-server environment, handheld devices, voice-operated systems, etc.
  5. the main application modes will be as follows: (1) stand-alone application (as with earlier versions of SuperMemo), (2) course application (e.g. for CD-ROM title publishing), (3) Internet application (esp. for organizing and learning web knowledge), (4) client-server application (e.g. for education in schools, for corporate training, etc.), and (5) tele-learning application (client-server approach over the Internet).
  6. from the user standpoint, knowledge will dynamically flow from the disorganized collection of items, Web pages, CD-ROMs, and other sources into the knowledge hierarchy: a graph of semantic connections between individual knowledge elements (Note: do not confuse knowledge hierarchy with a collection of hyperlinked pages)
  7. all knowledge elements (leaves of the knowledge hierarchy), whether pieces of information or external information sources, will be provided with processing attributes that may assume the following values: intact (not yet classified), suppressed (classified as irrelevant and made invisible in the knowledge system), dismissed (dismissed as cognitively irrelevant but valued for future reference), reviewed (reviewed and considered cognitively relevant; perhaps worth the pending status), pending (considered particularly important for its associative or inferential nature and scheduled for later committing) and committed (committed to memory of the user of the knowledge system).
  8. editable knowledge hierarchy that visualizes processing attributes of knowledge elements forms a knowledge chart that makes it easy to graphically view the user’s progress in wading through a sea of information
  9. all non-primitive elements in dedicated sources will be divided into semantic units that will also be provided with processing attributes. In non-dedicated sources, wherever possible, semantic units will be separated by means of available technologies (e.g. HTML tags, parsing tools, or simply highlighting tools; in the last case, the users will be able to set processing attributes for an equivalent of a semantic unit in the form of a display area highlighted with a mouse)
  10. processing attributes will determine the appearance and behavior of elements or their semantic units. For example, suppressed elements will disappear from view and their URLs will be made unavailable, committed elements will crop up in repetitions scheduled by means of SuperMemo, etc.
  11. ordinal attributes will be used to sort elements and semantic units with a view to their future processing. The following processing attributes will be associated with ordinal attributes: intact (ordinal attributes will determine the order of review; this attribute can only be set automatically by means of filtering tools, HTTP connection score, elements of information democracy, etc.), reviewed (ordinals will determine the order of the next review), pending (ordinals will determine the order in which items are committed to memory), committed (ordinals will determine rescheduling priority or uncommitting priority in cases of repetition overload).
  12. semantic attributes will be used to approximate the semantic contents of a semantic unit or element. These are needed for search and filtering purposes. The simplest approach to implementing semantic attributes is a keyword system. Future applications might make use of natural language processing technologies.
  13. knowledge filters can be used to determine visibility or accessibility of elements or semantic units within the knowledge system by making use of processing, ordinal and semantic attributes in dedicated sources and word context analysis in non-dedicated sources. Knowledge filters are a useful tool for addressing disorganized knowledge before it enters the knowledge hierarchy and in automatically determining ordinal attributes in the intact pool. Knowledge filters can also be used in thematic navigation.
  14. knowledge meters are tools for diagnostics and control of the flow of information between element pools tagged with different processing attributes. In general, the flow proceeds from the disorganized/external pool to the intact pool, then to the suppressed, dismissed and reviewed pools, then to the pending pool and finally to the committed pool. Some stages may be skipped (e.g. committing an element without placing it in the pending queue), and some backflow is not unusual (e.g. dismissing a once-committed item as a result of a loss of relevance, etc.). Knowledge meters make it possible to view the information flow as well as to impose minimum or maximum flow limits.
  15. knowledge should be divided into topics (elements that present information, like pages in a help system, web pages, etc.) and items (elements that have a stimulus-response structure, e.g. question and answer, that can be effectively used in the process of learning based on active recall)
  16. for training and tele-learning purposes, item subsets should be structured for automatic grading (e.g. with a multiple-choice test, spelling test, automatic voice recognition test, etc.)
  17. topics will be either local or remote in nature (e.g. as a web link executed via OLE browser in-place activation)
  18. application of items complying with the minimum information principle is not customary in present information sources. A number of solutions will have to be adopted to facilitate the transition to the new approach by content providers. Most importantly, World Wide Web extensions are inevitable. The adjustments will have to be made at both the server and the client side. The new generation of web browsers provides easy plug-in interfaces that can be addressed with format-independent OLE Documents (to extend or go beyond HTML), language-independent binary ActiveX controls, and scripting languages. On the server side, with standard methods such as CGI, WWW servers can be extended to communicate with back-end scripts, dynamically produce the content of a web page, store information the user has provided, etc. In Microsoft Internet Server, changes are possible via ISAs and other ISAPI extensions.
  19. for knowledge retention, Project SM-XXI envisages application of the most modern SuperMemo algorithms based on algebraic and algorithmic solutions combined with neural networks
  20. to let learning run in the background, without the need for a special time slot of the users, intelligent, self-configuring techniques for detecting idle state of the user’s terminal will be implemented (e.g. popping up the drilling procedure at download time, printing time, and other long-drawn processes; including selected operations in custom-chosen applications)
  21. although the theoretical foundations and underlying concepts might seem intricate, the new software solutions should provide an intuitive and fool-proof interface that will open the world of well-structured knowledge to every user with or without his or her understanding of cognitive processes
  22. particular software components will be used for complementary solutions like: handheld devices, voice-operated devices, redistributable modules for third-party developers, etc.
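
As a rough illustration of points 7, 11, 13 and 14 above, the following Python sketch models processing attributes, ordinal sorting, a keyword-based knowledge filter and a simple knowledge meter. It is not SMC's design: the names Processing, Element and KnowledgeSystem, the numeric ordinal (lower values are processed first) and the transition counter are all assumptions made for the example.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class Processing(Enum):
    """Processing attributes listed in point 7."""
    INTACT = "intact"
    SUPPRESSED = "suppressed"
    DISMISSED = "dismissed"
    REVIEWED = "reviewed"
    PENDING = "pending"
    COMMITTED = "committed"


@dataclass
class Element:
    """A knowledge element; all field names here are hypothetical."""
    title: str
    ordinal: float = 0.0                  # assumed: lower value = handled earlier
    keywords: frozenset = frozenset()     # simplest semantic attribute (point 12)
    status: Processing = Processing.INTACT


class KnowledgeSystem:
    def __init__(self):
        self.elements = []
        # Knowledge meter: counts transitions between pools as (old, new) pairs.
        self.flow = Counter()

    def add(self, element):
        self.elements.append(element)
        return element

    def set_status(self, element, new_status):
        """Move an element to another pool and record the flow."""
        if new_status is not element.status:
            self.flow[(element.status, new_status)] += 1
            element.status = new_status

    def pool(self, status):
        """Elements of one pool, sorted by their ordinal attribute."""
        members = [e for e in self.elements if e.status is status]
        return sorted(members, key=lambda e: e.ordinal)

    def visible(self, keyword=None):
        """Knowledge filter: hide suppressed elements, optionally match a keyword."""
        return [
            e for e in self.elements
            if e.status is not Processing.SUPPRESSED
            and (keyword is None or keyword in e.keywords)
        ]


ks = KnowledgeSystem()
http = ks.add(Element("HTTP basics", ordinal=2, keywords=frozenset({"web"})))
ad = ks.add(Element("Banner ad", ordinal=1))
url = ks.add(Element("URL syntax", ordinal=1, keywords=frozenset({"web"})))

ks.set_status(ad, Processing.SUPPRESSED)
ks.set_status(http, Processing.PENDING)
ks.set_status(url, Processing.PENDING)

print([e.title for e in ks.pool(Processing.PENDING)])   # ['URL syntax', 'HTTP basics']
print([e.title for e in ks.visible()])                  # ['HTTP basics', 'URL syntax']
print(ks.flow[(Processing.INTACT, Processing.PENDING)])  # 2
```

In this sketch the pending pool is ordered by ordinal, the suppressed element drops out of view, and the meter reports how many elements moved from the intact pool to the pending pool. Flow limits, semantic units and repetition scheduling are left out.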

Beyond simple software solutions:

Marketing strategy and the step-wise education of the public on cognitive technologies

Glossary (proprietary terminology is marked with SMC)