# Assuring Absolute QoS Guarantees for Heterogeneous Services in RINA Networks with ΔQ
- Sergio Leon Gaixas (@gaixas1) - Universitat Politècnica de Catalunya (ES)
In this tutorial, we show how to configure and use basic ΔQ scheduling policies to provide differentiated treatment to services according to their QoS. In addition, a first approach to the interaction between ΔQ policies and congestion control is shown, allowing a reasonable overbooking of resources while maintaining QoS requirements (with dynamic data-rate reduction of flows based on their requirements).
The scenario used in this tutorial can be found under the folder "/examples/Tutorials/DeltaQ_Scheduling" and is composed of the following files:
- net.ned: Network description.
- omnet.ini: Network and overall configuration.
- QTA.xml: Configuration of the QTAMux used for ΔQ policies.
- shimqoscube.xml: QoSCubes definition for shim-DIFs.
- {cong/free}_qoscube.xml: QoSCubes definition for upper DIFs in congestion-controlled and free scenarios.
- connection{shim/set3/set9}.xml: Definition of preconfigured flows.
- qosreq.xml: QoS requirements of preconfigured flows.
- data{0/1x3/1x9/10x3}.xml: Data flows definition for the different configurations.
- directory.xml: Configuration of IPCP locations.
The network described in "net.ned" is a 6-node network containing a sub-set of datacenter nodes, as seen in Figure 1. In this network, the main flows to consider are those departing from node A towards nodes B and C, these being full ToR-to-ToR flows. In addition, to emulate the bandwidth usage of other flows that could collide with them, multiple flows between nodes along that path are allocated.
Figure 1. Tutorial network.
While in all scenarios the ΔQ policies signal congestion in the form of ECN marking, QoSs in the Free* scenarios are configured to ignore it, resulting in high losses for low-cherished flows under periods of high load. In the Cong* scenarios, instead, QoSs are configured to reduce the data-rate of the aggregated flows according to the arrival of ECN-marked PDUs, resulting in low losses even in periods of high load from high-priority flows.

In terms of flows and QoS, in this scenario we consider a 3x3 Cherish/Urgency matrix. For each position, we define the QoS identifier as A*, B* and C* from most urgent to least, and *1, *2 and *3 from most cherished to least (the full matrix is laid out after the list below). Given the urgency of flows, we consider 3 different types of applications:
- QoSs A*: Real-time voice traffic. ON/OFF traffic with small PDUs and without retransmission.
- QoSs B*: Video on demand and web browsing. ON/OFF traffic with MTU-sized PDUs and retransmission of losses.
- QoSs C*: File transfer. Constant traffic with MTU-sized PDUs and retransmission of losses.
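For reference, combining the naming just described, the QoS identifiers occupy the 3x3 Cherish/Urgency matrix as follows:

```
                     *1 (most cherished)   *2    *3 (least cherished)
A* (most urgent)             A1            A2            A3
B*                           B1            B2            B3
C* (least urgent)            C1            C2            C3
```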
With those QoSs, we consider three traffic configurations, combining flows of 1 or 10 Mbps with either the full Cherish/Urgency matrix or only the triad A2, B1 and C3. In total, for this tutorial we consider 6 different configurations for the scenario (a sketch of how one of them maps onto omnet.ini follows the list):
- Without congestion control:
  - Free1Mbps3QoS
  - Free1Mbps9QoS
  - Free10Mbps3QoS
- With congestion control:
  - Cong1Mbps3QoS
  - Cong1Mbps9QoS
  - Cong10Mbps3QoS
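Each of these names corresponds to one configuration of the scenario in omnet.ini; in OMNeT++ these are typically expressed as [Config] sections. The following is only a hedged sketch of one of them, not the actual file contents:

```
# Hypothetical sketch of one configuration; only **.Inj.data is a parameter
# shown later in this tutorial. The actual omnet.ini additionally selects the
# matching QoSCubes XML (cong_qoscube.xml) and connection set
# (connectionset3.xml) for this scenario via the corresponding parameters.
[Config Cong1Mbps3QoS]
**.Inj.data = xmldoc("data1X3.xml", "flows")
```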
### Net.ned
The configured network is rather simple: 6 nodes partially connected using DatarateChannels of 200 Mbps and 1 Gbps with low delay. Even so, a small difference with respect to other examples can be seen in the use of "Inf_Router" nodes with the "VDT" injector and the addition of the "VDT_Listener" module. These two modules are the basis for injecting data into the upper DIF and harvesting statistics of that traffic, respectively. More information on them follows.
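As a rough orientation, the channel definitions one would expect in such a file look like the following minimal sketch (channel names and the exact delay value are assumptions; the declarations of the six Inf_Router nodes, the VDT injector and the VDT_Listener, as well as the connections, are omitted):

```
// Minimal NED sketch, not the actual net.ned: channel names and the delay
// value are illustrative. The real file also declares the six Inf_Router
// nodes (with the VDT injector), the VDT_Listener module and the topology.
channel Link200M extends ned.DatarateChannel
{
    datarate = 200Mbps;
    delay = 10us;
}

channel Link1G extends ned.DatarateChannel
{
    datarate = 1Gbps;
    delay = 10us;
}
```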
### Omnet.ini
First of all, for this scenario, PDU tracing and all kinds of statistics recording other than those provided by our modules have been disabled, and it is recommended to leave them like that given the duration of the simulations and the large number of PDU messages generated.
Now let's explain the basic configuration of the network. For node addressing, we use a simple A to F naming of the nodes. From those, each shim-DIF (those at ipcProcess0[*]) receives its name as the concatenation of the addresses of both endpoints. For each of those shims, a basic (uncontrolled) flow is allocated at t=0.
In the middle layer, we find the DIF "Fabric", containing all nodes. This is where the ΔQ policies are located. There, we use the QoSs of either "free_qoscube.xml" or "cong_qoscube.xml" depending on the scenario, and the aggregated flows configured in either "connectionset3.xml" or "connectionset9.xml" are pre-allocated after t=100 and before t=200.
In the top layer, we find the DIF "Net", where data is injected and then forwarded using the flows at Fabric. In this layer, we configure the same QoSs as in the lower one, but no flow is allocated, as we are going to inject PDUs directly into the RMT. Data will be injected starting at t=200.
#### Routing and forwarding
In order to correctly relay PDUs between the different DIF levels, routing and forwarding policies are configured as follows:
- At the shim-DIFs, no policies are needed ("on the wire").
- At the Fabric DIF, we use the simple forwarding policy generator "SimpleGenerator", with the link-state routing algorithm "SimpleLS" and the "MiniTable" forwarding policy (exact match + QoS).
- At the Net DIF, we do not use routing (the "DummyRouting" policy set) and use the "OFStaticGenerator" forwarding policy generator with the "SimpleTable" forwarding policy, simply to have direct access to the N-1 flows established through the Fabric DIF.
```
## At Fabric
**.ipcProcess1.resourceAllocator.pdufgPolicyName = "SimpleGenerator"
**.ipcProcess1.relayAndMux.ForwardingPolicyName = "MiniTable"
**.ipcProcess1.routingPolicyName = "SimpleLS"
## At Net
**.ipcProcess2.resourceAllocator.pdufgPolicyName = "OFStaticGenerator"
**.ipcProcess2.relayAndMux.ForwardingPolicyName = "SimpleTable"
**.ipcProcess2.routingPolicyName = "DummyRouting"
```
#### Queues and scheduling
The next step is the configuration of the scheduling policies and queue-related parameters. First, the default queue threshold is set to an arbitrarily large number: the ΔQ policies do not use the MaxQueue hook, so this avoids any possibility of it being executed. Now we are interested in the configuration of both the shim-DIFs and the Fabric DIF (the Net DIF uses the default best-effort configuration, as there we have only one queue per RMT port).
- Shim-DIFs configuration
Shim-DIFs in this scenario mimic the operation of real shim-DIFs, with minimal policies and small buffers that block when full. We configure one queue per requested flow ("QueuePerNFlow" and "IDPerNFlow") with the pair of monitor and scheduling policies "IterativeStopMonitor" and "IterativeScheduling".
In this case, as we only have one working flow per shim-DIF, what this policy does is signal "full" to upper flows after having more than "stopAt" PDUs in queue, and "not full" when going below a second threshold, "restartAt", here configured at the minimum values of 1 and 0 respectively.
```
**.ipcProcess0[*].**.queueAllocPolicyName = "QueuePerNFlow"
**.ipcProcess0[*].**.queueIdGenName = "IDPerNFlow"
**.ipcProcess0[*].relayAndMux.maxQPolicyName = "TailDrop"
**.ipcProcess0[*].relayAndMux.qMonitorPolicyName = "IterativeStopMonitor"
**.ipcProcess0[*].relayAndMux.schedPolicyName = "IterativeScheduling"
**.ipcProcess0[*].relayAndMux.queueMonitorPolicy.stopAt = 1
**.ipcProcess0[*].relayAndMux.queueMonitorPolicy.restartAt = 0
```
- Fabric configuration
The Fabric DIF is where the ΔQ policies are configured. As ΔQ policies are configured per queue, we have multiple options available. Here we consider the configuration of ΔQ per QoS, therefore allocating one queue for each QoS in use ("QueuePerNQoS" and "IDPerNQoS").
The scheduling policy for ΔQ is the simple "QTASch" policy, which basically works by querying the monitor policy for the next queue to serve. Then, in the "QTAMonitor" monitor policy, we have all the logic of ΔQ within a configurable module. For this module we need to configure "shapersData" and "muxData", with the configuration of the queue shapers and the CU multiplexer respectively. We will examine their configuration later.
```
**.ipcProcess1.**.queueAllocPolicyName = "QueuePerNQoS"
**.ipcProcess1.**.queueIdGenName = "IDPerNQoS"
**.ipcProcess1.relayAndMux.qMonitorPolicyName = "QTAMonitor"
**.ipcProcess1.relayAndMux.schedPolicyName = "QTASch"
**.ipcProcess1.relayAndMux.queueMonitorPolicy.shapersData = xmldoc("QTA.xml", "Configuration/shapers")
**.ipcProcess1.relayAndMux.queueMonitorPolicy.muxData = xmldoc("QTA.xml", "Configuration/mux")
```
#### Data injection
Finally, with the network configured, we are going to configure the data injection and the collection of statistics. First of all, as stated when describing the "net.ned" file, in this scenario data is injected directly into the RMTs of the Net DIF IPCPs. This means that there are no upper applications nor EFCP instances to retrieve it. Instead, we use the "Inj" module to generate large amounts of data and the "Inj_Comparator" to retrieve it at the end-points.
The first step is to configure the duration of the simulation. The starting point of data injection is configured in "Inj.ini" and, in the same way, the "stop" moment is configured in "Inj.fin". It has to be noted that "Inj.fin" only sets the moment at which flows stop requesting more data, so new data can still be created to complete old requests after that moment. In order to really stop the simulation at a given moment, we should also configure "sim-time-limit".
```
sim-time-limit = 305s
**.Inj.ini = 200
**.Inj.fin = 300
```
Next comes the configuration of the injected flows. While most of it is done via the XMLs configured as "Inj.data", explained later, there are also some parameters configurable in the ini file. First, two parameters shared between all flows: the length of the headers that lower DIFs will add, "headers_size" (in this case 22 bytes), and the value of the ACK timer, "ackT" (here left at its default value of 0.1 s).
Then the different flow generators can be configured. In this case, we set the average duration of the ON and OFF periods of voice flows, "V_ON_Duration_AVG" and "V_OFF_Duration_AVG", at 1/3 s and 2/3 s respectively, with PDUs from 100 to 400 bytes. We also configure the average data-rate in Kbps for each configuration ("V_AVG_FlowRate" for voice (A*), "D_AVG_FlowRate" for video (B*) and "T_AVG_FlowRate" for data (C*)) and the data-rate of video flows during requests ("D_ON_FlowRate").
```
**.Inj.headers_size = 22
**.Inj.V_ON_Duration_AVG = 1/3
**.Inj.V_OFF_Duration_AVG = 2/3
**.Inj.V_PDUSize_min = 100
**.Inj.V_PDUSize_max = 400
#Configuration 1Mbps flows x 3QoS
**.Inj.V_AVG_FlowRate = 1000
**.Inj.D_AVG_FlowRate = 1000
**.Inj.D_ON_FlowRate = 1500
**.Inj.T_AVG_FlowRate = 1000
**.Inj.data = xmldoc("data1X3.xml", "flows")
```
Finally, we need to capture statistics. By default, the VDT_Listener saves its results into "stats/{CONFIG_NAME}_{RUN}.results". In addition, we may print them to the standard output if the "printAtEnd" parameter is set. Also, if the parameter "recordTrace" is set, it will generate a trace following all PDUs generated and received by the different flows, producing the binary file "stats/{CONFIG_NAME}_{RUN}.trace" and the index of flows "stats/{CONFIG_NAME}_{RUN}.traceinfo".
* As traces can become quite big rather fast, it is recommended to keep them turned off. Their usage is not considered in this tutorial and not explained here (as a note, they are sequences of "trace_t" structs, as described in "src/Addons/DataInjectors/FlowsSimulation/Implementations/VDT/VDT_Listener.h").
In addition to the end-to-end statistics collected at the VDT_Listener, we can also configure the QTAMonitor to extract some information on incoming and outgoing data. For this, we first have to set the parameter "recordStats" to true and optionally give a nodeName to the IPCP (otherwise the module path is used). Then, there are 4 types of data that can be recorded per port, depending on which parameters are turned on:
- pdu_IO: Number of PDUs arriving at an out port, dropped and served.
- data_IO: Amount of data arriving at an out port, dropped and served.
- pdu_IOi: Number of PDUs arriving at an in port.
- data_IOi: Amount of data arriving at an in port.
After deciding which data to record, we also have to configure the interval between recorded frames ("record_interval", by default 0.1 s) and the start and end of the recording ("first_interval" and "last_interval"). Finally, if we set the parameter "saveStats" to true, "stats" and "in.stats" files will be generated for the different ports.
```
**.vdt_Listener.recordTrace = false
**.ipcProcess1.relayAndMux.queueMonitorPolicy.recordStats = true
**.A.ipcProcess1.relayAndMux.queueMonitorPolicy.nodeName = "A"
**.B.ipcProcess1.relayAndMux.queueMonitorPolicy.nodeName = "B"
**.C.ipcProcess1.relayAndMux.queueMonitorPolicy.nodeName = "C"
**.D.ipcProcess1.relayAndMux.queueMonitorPolicy.nodeName = "D"
**.E.ipcProcess1.relayAndMux.queueMonitorPolicy.nodeName = "E"
**.F.ipcProcess1.relayAndMux.queueMonitorPolicy.nodeName = "F"
**.ipcProcess1.relayAndMux.queueMonitorPolicy.saveStats = true
**.ipcProcess1.relayAndMux.queueMonitorPolicy.printAtEnd = false
**.ipcProcess1.relayAndMux.queueMonitorPolicy.pdu_IO = true
**.ipcProcess1.relayAndMux.queueMonitorPolicy.pdu_IOi = true #record intervals
**.ipcProcess1.relayAndMux.queueMonitorPolicy.data_IO = true
**.ipcProcess1.relayAndMux.queueMonitorPolicy.data_IOi = true #record intervals
**.ipcProcess1.relayAndMux.queueMonitorPolicy.first_interval = 200.0
**.ipcProcess1.relayAndMux.queueMonitorPolicy.last_interval = 300.0
```
### XML files
#### QTA.xml
The QTA.xml file is divided into two parts: the configuration of the shapers and of the mux.
Shapers are configured per queue and allow both selecting the cherish/urgency levels of the queue references passed to the mux and adding internal contention between flows going through the same queue. For each configured queue, we have a shaper entry with id = "queue_name" (note that out queues have names starting with the prefix "outQ"). We also have to configure the shaper "type". Here we are using the most basic shaper for all queues (type = 0), which simply passes queue references to the mux as soon as they arrive. Additionally, we have to configure the urgency and cherish values used by this shaper (in this case given by the position of the QoS in the Cherish/Urgency matrix).
Other available shaper types add burst limitation, random spacing, etc., but those and their configuration are not considered in this tutorial.
QTA.xml -> shapers
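The actual listing is in the example's QTA.xml. As a hedged sketch of what the shapers section could look like, assuming attribute names that follow the description above (only the Configuration/shapers path is confirmed by the xmldoc() call in omnet.ini):

```
<Configuration>
  <shapers>
    <!-- Hypothetical sketch: one basic shaper (type 0) per output queue,
         placing each QoS at its position in the 3x3 Cherish/Urgency matrix.
         Element and attribute names are assumptions; check the provided
         QTA.xml for the exact schema. -->
    <shaper id="outQ_A1" type="0" urgency="0" cherish="0"/>
    <shaper id="outQ_A2" type="0" urgency="0" cherish="1"/>
    <shaper id="outQ_B1" type="0" urgency="1" cherish="0"/>
    <!-- ... one entry per QoS in use ... -->
  </shapers>
  <mux>
    <!-- mux configuration, sketched further below -->
  </mux>
</Configuration>
```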
Apart from configuring the queue shapers, QTA.xml also holds the configuration of the mux used in the QTAMonitor. Here we also have multiple mux types available. In this scenario, we use the mux of type = 3. The configuration of this mux goes as follows:
- maxU: Maximum urgency level accepted (note that 0 is the most urgent and maxU the least urgent). Urgencies are translated into strict priorities.
- maxC: Maximum cherish level accepted (cherish levels depend on their configuration, so this only limits their number).
- defaultEthTh: Default threshold to mark ECN randomly.
- defaultEthTh: Default threshold to mark ECN always.
- defaultEthprob: Default probability of marking ECN.
In addition, we have to manually configure the cherish thresholds for each cherish level (CTh object, where "C" is the cherish level, "ath" the absolute drop threshold, and "th" and "p" the threshold and probability of random drop). Additionally, we may configure ECN thresholds per queue, differentiating the requirements with respect to data-rate reduction in overbooked scenarios.
QTA.xml -> mux
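Again, the actual listing is in the example's QTA.xml. A hedged sketch of the mux section, using the parameter names listed above as attributes (the exact schema, the representation of the per-cherish thresholds and all numeric values shown are assumptions):

```
<Configuration>
  <mux type="3" maxU="2" maxC="3" defaultEthTh="20" defaultEthprob="0.5">
    <!-- Hypothetical sketch: per-cherish-level drop thresholds, with "ath"
         the absolute drop threshold and "th"/"p" the random-drop threshold
         and probability. How the cherish level "C" is encoded in the real
         file is an assumption; values are illustrative only. -->
    <CTh C="1" ath="200" th="150" p="0.1"/>
    <CTh C="2" ath="100" th="60"  p="0.2"/>
    <CTh C="3" ath="50"  th="20"  p="0.5"/>
  </mux>
</Configuration>
```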
#### data*.xml
The multiple data*.xml files describe the number and QoS of the flows between the different nodes in the network. Flows are defined in sets. Each set specifies a src and dst node, given by their addresses, the number of voice, video and data flows ("V", "D" and "T" respectively) and their QoS ("Vq", "Dq" and "Tq"). While each set defines the three types of flows between the nodes, it is possible (and required) to use more than one set for the same pair of nodes if we want to generate more variations of type and QoS.
data.xml
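A hedged sketch of what a flow-set definition could look like, assuming the fields described above are attributes (the root element "flows" matches the xmldoc() call in omnet.ini; everything else should be checked against the provided data*.xml files):

```
<flows>
  <!-- Hypothetical sketch: one set of flows from node A (acting as client)
       towards node B: one voice, one video and one data flow with QoS A2,
       B1 and C3 respectively. Attribute names are assumptions. -->
  <set src="A" dst="B" V="1" Vq="A2" D="1" Dq="B1" T="1" Tq="C3"/>
  <!-- additional sets between the same pair of nodes can be added to mix
       other flow counts and QoS combinations -->
</flows>
```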
* It has to be noted that flows are started at the src nodes. That means that for video and data flows the src node acts as the client, only issuing requests and ACKs. This has to be taken into account when configuring the network, as the main data flow will then be directed from dst to src.
### Results
The results of these simulations are stored in the "stats" folder under the example directory (it needs to be created beforehand). Depending on the final configuration used, multiple files will be generated with the global results and/or per-port statistics. Here is the description of each type of results file:
- /{CONFIG_NAME}_{RUN}.results
This file contains the final statistics of all flows. Its formatting is easy to read. Flows are identified as “:: SRC -> DST \ QoS [ID]”.
It shows the minimum, average and maximum delay of all packets in the flow, plus the number of sent and received packets and the percentage of correctly received packets.
```
Voice
:: A -> B \ A2 [0]
min/avg/max | 5.05701 / 11.8633 / 24.1447 ms
sent/recv/% | 489153 / 224710 / 45.9386 %
:: A -> B \ A2 [1]
min/avg/max | 4.5791 / 11.8584 / 24.0886 ms
sent/recv/% | 489648 / 230629 / 47.101 %
:: A -> C \ A2 [2]
min/avg/max | 5.84523 / 13.5288 / 23.4093 ms
sent/recv/% | 503114 / 131950 / 26.2267 %
:: A -> C \ A2 [3]
min/avg/max | 0.208441 / 13.7658 / 23.3605 ms
sent/recv/% | 505386 / 129482 / 25.6204 %
:: A -> D \ A2 [4]
min/avg/max | 3.09197 / 5.50461 / 12.1036 ms
sent/recv/% | 499212 / 303537 / 60.8032 %
...
```
- /{CONFIG_NAME}_{RUN}.SRC.DST.stats
These files contain the statistics at output ports per interval. Some values may not be computed, or the whole file may not be created, depending on the configuration.
```
Node D
Port A
QoS  pdu_in  pdu_out  pdu_drop  data_in  data_out  data_drop
Interval 0 200
Interval 1 200.1
A1   1       1        0         10       10        0
A2   3       3        0         30       30        0
A3   3       3        0         30       30        0
B1   0       0        0         0        0         0
B2   11      11       0         426      426       0
B3   34      34       0         814      814       0
C1   380     380      0         9172     9172      0
C2   499     499      0         12100    12100     0
C3   682     682      0         16616    16616     0
Interval 1 200.2
...
```
* Note: Given the time required to run this scenario, and since the reader is encouraged to play with the different configurable parameters, no fingerprint check is done in this simulation.