Application design

The application will be designed to approximate the principles of operation of four popular algorithms for page replacement in the problem of virtual memory management: optimal, aging, LRUr to best match the activities taking place in each of the algorithms, the object-oriented programming paradigm will be used, and the process of software development will be supported by the latest technologies. The result of the program will allow you to compare the algorithms in terms of optimality and time of operation.

Technologies and design tools

The program executing selected algorithms of virtual memory management will be written in Python version 3.6.

The project will be implemented in PyCharm 2018.2 Professional Edition. It is an integrated development environment (IDE) for the JetBrains Python programming language. It provides, among others, editing and analysis of source code, graphical debugger, running unit tests, integration with the version control system.

Cooperation in the implementation of selected algorithms will enable a distributed Git version control system. It is software used to track changes mainly in the source code and to help in combining changes made to files by many people at different times.

In addition, the project will be hosted by GitHub, which is designed for programming projects using the Git version control system.

Algorithms concept

The scheme of algorithms was mainly searched for in Internet sources and literature.

Clock algorithm

The clock algorithm is a more efficient version of the second chance algorithm (FINUFO, First In Not Used First Out), because the pages do not have to be constantly moved at the end of the list, but have the same general function as the second chance.

The algorithm stores a cyclical list of pages in memory (clock face), and the iterator (clock pointer) indicates the last page examined in the list. When a page error occurs and there are no empty frames, the R (reference) bit is checked at the indicator location. If R is 0, the new page is inserted in place of the page indicated. Otherwise, the R bit is reset, and the clock iterator is incremented.

The process is repeated until the page is replaced.

The implementation uses a second chance algorithm with a recurring list. When a page exchange is required, but no free pages can be found, the "demon swap" is run, it prints all the pages that are busy on the disk. This helps to get fewer page errors at the expense of more disk writes.

This decision was made to avoid page errors, which is the main criterion for assessing the effectiveness of the algorithm.

Last Recently Used (LRU)

This algorithm assumes that pages that were most often used in the past will also be the most used in the future. While LRU provides almost optimal performance in theory, it is expensive to implement.

This method uses a linked list containing all pages in memory. At the end of this list is the least used page, in turn the most used page is at the beginning. The cost of this implementation is due to the fact that the list must be updated after each reference to the memory, which is time-consuming.

Another method of implementing the algorithm can be a method with hardware support. It requires a hardware counter, which is incremented after each instruction. When accessing the page, it receives the value of the counter from the access time to the page.

The implemented implementation uses the counter, which reduced the time complexity of the algorithm.

Optimum algorithm (OPT)

This is theoretically optimal algorithm. Known as OPT or Bélády's algorithm

When a page needs to be replaced, the operating system chooses the page that will not be used for the longest. The algorithm can not be fully implemented because it is not possible to predict precisely when the page will be needed. Exceptions may be provided for system actions.

Despite the limitations, the algorithm exists and provides an almost optimal operation, but only when the program's memory usage scheme is known (the condition must be small variability in the course of the program). Therefore, the algorithm does not apply to programs that operate for the first time, or in programs with variable waveforms, i.e. large differences in memory references.

In the implementation of this algorithm, it was assumed that the page will be exchanged, the first occurrence of which will not be used for the longest time in the future - a simplified version of the algorithm.

Aging

The aging algorithm is derived from the NFU (Not Frequently Used) algorithm.

The algorithm NFU requires a counter. Each page in the page table has its own counter, initially set to 0. In each of the time intervals, all pages referenced during this interval increment their counters. As a result, the counters store the number of page usage. The page with the lowest counter value is replaced when needed.

The main problem with NFU is that it tracks the number of pages used without information about the time of use of these pages. This favors the pages heavily used at the beginning. An example of such inefficiency might be the start of the operating system.

In turn, the aging algorithm takes into account the length of time the pages are used. Instead of just incrementing the counter of each page, the bit OR of the oldest bit of the counter with the page reference bit is performed. Every n made by the meter is divided by 2.

Example for a page with reference bits 1, 0, 0, 1, 1, 0 and OR execution with each instruction: 10000000, 01000000, 00100000, 10010000, 11001000, 01100100.

It follows that references closer to the present have more influence than the older ones. This provides a higher priority for parties with recent appeals (however, with a lower frequency of appeals), at the expense of pages with high frequency of referrals but occurring in the past. The page with the lowest value of the counter is replaced.

The limit of this method is the length of the counter. The program assumes that the counter is 16-bit.

An additional parameter distinguishing this algorithm is the refresh rate of the counters, which is the number of addresses retrieved, after which each counter will be divided by 2.

Class diagram

Only the algorithm classes are shown in the class diagram; classes from the algorithms module. The written program has no inheritance except basic, followed by object.

Class diagram for algorithms

Below is the class diagram for the tests module. Test classes, i.e. TestOpt, TestLru, TestAging and TestClock inherit from TestCase. Other classes, TableStates and PublicParams were created exclusively as auxiliary classes for testing.

Diagram of test classes

Structure of input data

In order to simulate and compare the operation of all the page replacement algorithm you need:

number of shared frames in the page table (default 3),
a file containing subsequent references to the memory along with a reading/writing operation (default tests/resources/test.trace),
counter refresh rate (only for the aging algorithm, default 5).

The program can be run according to the default arguments:

$ python vmsim.py

or giving arguments:

$ python vmsim.py --numframes 8 [--refresh 6] --tracefile othertrace.trace

The file structure with the trace extension is also important. The following is an example from the test file test.trace:

In each line the first 8 characters (hexadecimal) are 32-bit memory address, of which the first 20 bits (5 hexadecimal characters) are the virtual page number, and the next 12 bits (3 characters) hexadecimal) is an offset within a 4 kB page. Then there is one space followed by W or R character, corresponding to the types of write operations and read.

Structure of results

The result of the program in which all algorithms are performed for the same set of data is saved to the CSV file. The default location of results is the data folder in the main branch of the project. Depending on the number of pages in the input file, a subfolder is created for the results whose name corresponds to [number of pages]_frames. Then in such a folder the resulting file is created, in the naming convention [number of frames]_frames.csv.

The file contains the following information:

alg - the name of the algorithm.
trace_file - file name with input data.
frames - the user-defined number of frames used to execute the algorithms.
total_mem_access - the total number of memory accesses.
page_faults - number of page errors for a given algorithm.
writes - number of recordings made to disk storage.
refresh - counter refresh rate (number of its iterations).

This parameter is significant only in the aging algorithm, therefore for the others it is N/A.
total_time - total time devoted to the execution of a given algorithm given in milliseconds.

The following is an example of the generated results for a file with 15,000 pages called 15000.trace and for 32 frames:

alg,trace_file,frames,total_mem_access,page_faults,writes,refresh,total_time
Clock,15000.trace,32,15000,11729,1948,N/A,272.733
LRU,15000.trace,32,15000,11762,3683,N/A,235.619
Aging,15000.trace,32,15000,11745,3477,5,691.59
Opt,15000.trace,32,15000,8980,3234,N/A,721.038

As described above, such a file is generated to a file in the location (relative to the project) data/15000_trace/32_frames.csv.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly