MCMC approach to modelling queuing systems

The paper presents some numerical results on modelling a M/G/1/∞ queuing system using Markov chain Monte Carlo (MCMC). This particular technique was chosen in order to draw samples from the service time distribution from which it is assumed complicated to sample random numbers. The software for experiments was created using C++ Builder development environment.


Introduction
Consider a single channel queuing system depicted in Fig.1. λ represents the arrival rate (number of arriving customers per time unit) and µ is the service rate (number of customers being serviced per time unit). When the service time has a complicated or unknown distribution, it is difficult (or even impossible) to draw samples from that distribution. In real world this can present a situation, in which the process of arriving customers is a Poisson process and the service time has an unknown distribution. In this situation it is possible, for example, to construct a kernel density estimate for service time distribution if there are some empirical data given. But the latter estimate does not indicate how to sample from the service time distribution. Consequently, the necessity for special sampling technique emerges. MCMC was chosen to deal with this question.

MCMC and Metropolis-Hastings algorithm
If a service time density function π(·) is taken, Y i ∼ π(y), i = 1, n, have to be generated to run the simulation of the queuing system. The idea of MCMC is to construct a Markov chain {X i } ∞ i=0 such that lim i→∞ P (X i = x) = π(x), Every Markov chain can be determined through an initial state (1) and a transition kernel (2). It is known that the stationary distribution is unique if Markov chain is ergodic: Suppose the discrete stationary probabilities π(X i ) had been given. Then, having ergodic and discrete Markov chain, the equation (3) holds. Total number of (n − 1) equations and n(n − 1) unknown transition kernel probabilities are apparent. Thus there exist an infinite number of transition kernels representing the stacionary distribution π(x). Any of those transition kernels can be constructed and used for generating X i . One of the most widely used methods for constructing such Markov chain is Metropolis-Hastings algorithm [2].
Metropolis-Hastings algorithm is implemented as follows. At first an optional transition kernel Q(y|x) is chosen. Then there exists a probability α for chosen kernel Q being equal to transition kernel P : Considering the detailed balance condition of a time-homogeneous Markov chain we have: π(x)Q(y|x)α(y|x) = π(y)Q(x|y)α(x|y), ∀x = y.
Sampling of each X i is performed in 4 steps. Firstly a candidate point X i is drawn from proposal distribution. Then the probability α i , that this point is also distributed by the target density, is calculated. The next step is to draw u i ∼ U (0; 1) and compare it to α i . Finally, X i is accepted to the sample if u i < α i . Otherwise From (6) it is evident that π(x) can be determined up to a multiplicative constant c, i.e., π(x) = c · h(x), where h(x) is a probability density function. Having chosen Q(x|y) ≡ Q(x), an MCMC independence sampler is implemented. In that case there is no neccesity for Markov chain to loose its memory because each X i depends on ratio π(·) Q(·) and not on X i−1 .

Proposal selection and convergence of MCMC
Suppose we have a M/G/1/∞ queue in which service time X is a random number having lognormal distribution f t (x, µ, σ) with parameters µ = 0.5 and σ = 0.6. Nevertheless this density could be sampled using normal distribution, say it is complicated to sample from it. Here comes the necessity for special sampling technique. A proposal density Q must be chosen now. There are many techniques to achieve this, because the problem is to find probability distribution similar in shape to the target distribution. One way of achieving this is to take Erlang distribution f e (x, k, λ) and compare its first 3 moments according to moments m 1 , m 2 and m 3 of the target distribution.
By doing this, equations (7) for k and λ are obtained. Substituting these values into f e (x, k, λ) make it target density approximation, that is, proposal density. Difficulties arises when performing such approximation. k calculates as a floating point number and needs to be integer. There are cases when rounding k gives undesirable results, in such cases approximation is more precise if 1 or 2 is added to k. The second drawback is the non-versatility of this method. For particular distributions custom k and λ are found, but approximation is undesirable.
In this particular case k = 2, λ = 0.882 and f e (x, k, λ) is quite similar in shape to f t (x, µ, σ). Let us denote f π (·) to be an empirical probability density obtained by MCMC. The histogram, consisting of 100 bars, of sampled numbers X i , was taken as f π (·). The differences between the target and the proposal densities or between the target and empirical densities are evaluated as: According to Table 1, the convergence of MCMC independence sampler to the target distribution is rather slow.

Numerical model of M/G/1/∞ queuing system
Consider a queuing system with λ = 0.4 when service time distribution is f t (x, 0.5, 0.6). The software created represents a queuing system run as a dynamic chain of objects in computer memory. Each object has attributes just as a particular customer. The first attribute is arrival time to the system. The second attribute is service time. The time in queue, the time in system and etc are calculated recursively during modelling process. Arrival time is generated using inverse cumulative exponential distribution, although service times is said to have difficult distribution and, therefore, it is sampled using MCMC. The approximation of target density with Erlang distribution function was used as a technique for constructing a proposal density.
Although MCMC convergence to the target distribution is a matter-of-course, the result of this sampling technique is still a Markov chain, i.e., numbers are dependent. Fig. 3 shows that there are moments when X i+1 = X i . Before using these numbers for modelling a queuing system it is essential to scramble the sample to prevent from being dependent. A simpler way is to accept every second (third, etc.) random number to the sample while running MCMC. The more f t (·) differs in shape from the proposal density, the more numbers should be missed. If MCMC process run is dependent, there is a probability for two or more service times in a row being equal and relatively high. This results in the queue being much higher in length than the average queue for some time. According to the experiments carried out, time and queue characteristics were higher than the theoretical ones if dependency was not eliminated from the sample.
Having generated a single process run of independent random values, the average value of customers in queue or system and the mean value of time they were in queue or system can be evaluated. It is advisable to generate several process runs and calculate the average estimates. By doing this the dispersion of an estimate is being reduced. The system's empirical characteristics are compared to theoretical ones in order to evaluate the accuracy of the model. M/G/1/∞ queuing system's theoretical characteristics are calculated in accordance with Little's and Pollaczek-Khinchin formulas [1]. Figs. 4 and 5 show the mean absolute value of the average relative error of each of the queue's characteristics for specified λ. These results were obtained by modelling 50000 arriving customers for 10 times for each λ.
Any characteristic is calculated more precisely if λ is small. This dependence is reasonable because the less customers enter the system, the shorter the queue is. But the purpose of this experiment is to obtain a numerical value of what the average error will be for this particular system. Considering this relationship, one can decide whether modelling a system with particular parameters will give desirable results. It is also noticeable from Figs. 4 and 5 that system's characteristics are evaluated more precise than the queue characteristics. When modelling the system by usual techniques (e.g., inverse cumulative distribution function), the precision of its characteristics has similar trends. In several cases it is possible to achieve a smaller rate of error with MCMC, when random numbers sampled with MCMC have a better quality (e.g., if scrambled they can be less dependent).

Conclusions
1. The higher the ratio λ µ , the higher the dispersion of estimated system characteristics. The modelling also shows that errors for each system characteristics are dependent (Figs. 4 and 5). 2. Heavier tails of the service time distribution leads to the higher dispersion of characteristics modelled. 3. The results obtained enables us to choose how many arrivals of customers we have to sample in order to get desired precision of system characteristics while knowing its parameters. 4. When using MCMC for queuing systems or other type of logical aggregates, the sample X i ∼ π(x) must be scrambled to prevent X i from being dependent.