Iterative learning control for multi-agent systems with impulsive consensus tracking

Xiaokai Cao, Michal Fečkan, Dong Shen, JinRong Wang
Department of Mathematics, Guizhou University, Guiyang 550025, Guizhou, China
xkcaomath@126.com; jrwang@gzu.edu.cn
Department of Mathematical Analysis and Numerical Mathematics, Faculty of Mathematics, Physics and Informatics, Comenius University in Bratislava, Mlynská dolina, 842 48 Bratislava, Slovakia
michal.feckan@fmph.uniba.sk
Mathematical Institute, Slovak Academy of Sciences, Štefánikova 49, 814 73 Bratislava, Slovakia
School of Mathematics, Renmin University of China, Beijing, China
dshen@ieee.org
School of Mathematical Sciences, Qufu Normal University, Qufu 273165, Shandong, China


Introduction
Multi-agent systems (MASs) have been widely used over the past decade in areas such as unmanned vehicles, wireless sensor networks, and communication networks. For example, every satellite in GPS is an agent, and the whole GPS is a multi-agent system: the satellites exchange information with one another and transmit information to the ground to guarantee accurate positioning. The consensus problem is a fundamental issue for MASs because of its wide applications in formation control, distributed estimation, and congestion control. Consensus tracking over networks means that the outputs of all agents track a given objective synchronously. Abrupt changes of state may occur at certain time instants in biological and physical systems. For example, the migration of birds is subject to abrupt changes due to harvesting and diseases. In such scenarios, MASs with impulses describe well the inevitable interference that arises during actual system operation. When GPS satellites are hit by solar storms or other external interference, their trajectories may shift, which is an impulsive phenomenon. This paper discusses only instantaneous impulses; that is, the duration of an impulse is very short compared with the whole process. Studying the uniform tracking of impulsive MASs amounts to studying whether the agents can return to the predetermined trajectory through information exchange after being disturbed by the external environment. In this regard, Cui conducted related research in [6]. However, very few existing papers, for example, [8,14,16,21,32,35,36,38], have considered the consensus problem of MASs with impulses in the conventional consensus framework. In addition, the impulsive control approach is advantageous in simplicity and flexibility for such systems because continuous state information is not required.
As a consequence, this approach has been applied to the uniform tracking problem [9-11, 15, 27, 28, 31, 39] and to adaptive consistency and synchronization problems [5, 7, 22-25, 29] for MASs.
For a robot performing a trajectory tracking task over a finite time interval, iterative learning control (ILC) uses the error information measured during one or more previous operations to correct the control input, so that the performance improves along the iteration axis. Consequently, the desired trajectory can be precisely tracked over the entire time interval by the inherent mechanism of learning. ILC was first proposed in [2] for a robot, whereas Ahn and Chen [1] applied ILC to the consensus tracking of a MAS. Recently, ILC laws have been extensively studied for various types of MASs, such as fractional-order MASs [4, 17-20, 26, 33, 34, 37]. Note that MASs with impulses can generate discontinuous inputs; thus, it is still challenging to determine whether ILC can be successfully applied to collect the sampled error data from each agent and track a continuous or discontinuous trajectory, i.e., to achieve leader-following consensus for the nonlinear dynamics of a MAS with impulses. In addition, [12,13] use Lyapunov stability theory to analyze the coordination performance of MASs.
In view of the above discussion, we address the application of learning-type consensus tracking algorithms to MASs in this paper. In particular, we use D-type and PD-type ILC laws to derive the formation tracking performance of an impulsive MAS under a fixed topology. The D-type ILC update law is a differential learning law, which uses the derivative of the error signals from the previous iteration to correct the input signals for the next iteration. The PD-type ILC update law is the superposition of a proportional learning law and a differential learning law; it uses the error signals from the last iteration and their derivatives to correct the input signals for the next iteration. A fundamental challenge in this paper is how to design an effective ILC law by using information from the tracked trajectory and from the neighbors of each agent. This challenge is resolved by providing flexible control inputs according to the changes of the system states at fixed points. The output can then track a piecewise continuous trajectory by using continuous-time topology connections involving some instantaneous information exchanges.
The rest of the paper is organized as follows: Section 2 provides the problem formulation and preliminaries. Section 3 provides the main results of this paper. An illustrative example is presented in Section 4.

Preliminaries and notation
Consider a weighted directed graph $Q = (V, E, Z)$ composed of the set of vertices $V = \{1, 2, 3, \ldots, N\}$, where $N$ is the number of agents in the system, the set of edges $E \subseteq V \times V$, and the adjacency matrix $Z$. The set $E$ consists of ordered pairs $(i, j)$, where $(i, j)$ means that agent $i$ can pass information to agent $j$; in that case, $i$ is called the parent node of $j$, and $j$ is called the child node of $i$. The set of agents adjacent to agent $i$ is called the adjacency set of agent $i$ and is denoted by $M_i = \{j \in V \mid (j, i) \in E\}$. $Z = (z_{i,j})_{N \times N}$ is the weighted adjacency matrix of $Q$, which is composed of nonnegative elements $z_{i,j}$. In particular, $z_{i,i} = 0$; if $(j, i) \in E$, then $z_{i,j} = 1$, which means that agent $j$ can pass information to agent $i$; if $(j, i) \notin E$, then $z_{i,j} = 0$, which means that agent $j$ cannot pass information to agent $i$. The Laplacian matrix of $Q$ is defined in the standard way: its $i$th diagonal entry is $\sum_{j \in M_i} z_{i,j}$, and its off-diagonal entry in position $(i, j)$ is $-z_{i,j}$. If a directed graph has one node that has no parent and all other nodes have exactly one parent, the directed graph is called a spanning tree.
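The graph notation above can be made concrete with a short sketch. It assumes the standard in-degree Laplacian (row sums of $Z$ on the diagonal, $-z_{i,j}$ off the diagonal) and uses 0-based agent indices; the function names and the three-agent chain are hypothetical and serve only as an illustration.

```python
def adjacency_sets(Z):
    """Return M_i = {j : z_{i,j} > 0}, the agents that send information to agent i."""
    return [{j for j, weight in enumerate(row) if weight > 0} for row in Z]


def laplacian(Z):
    """In-degree Laplacian (assumed standard form): row sums of Z on the
    diagonal and -z_{i,j} in the off-diagonal positions."""
    n = len(Z)
    return [[sum(Z[i]) if i == j else -Z[i][j] for j in range(n)] for i in range(n)]


# Hypothetical chain 0 -> 1 -> 2 (0-based): Z[i][j] = 1 means agent j sends to agent i.
Z = [
    [0, 0, 0],
    [1, 0, 0],
    [0, 1, 0],
]
neighbors = adjacency_sets(Z)
L = laplacian(Z)
row_sums = [sum(row) for row in L]  # every row of a Laplacian sums to zero
print(neighbors, L, row_sums)
```

Rows index the receiving agent, matching the convention that $z_{i,j} = 1$ when agent $j$ can pass information to agent $i$.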
In this paper, $\|a\|$ denotes the 2-norm of a vector $a$, and $\|A\|$ denotes the matrix norm compatible with it. The $\lambda$-norm of a function $v : [0, \alpha] \to \mathbb{R}^n$ is defined as $\|v\|_\lambda = \sup_{t \in [0, \alpha]} e^{-\lambda t} \|v(t)\|$, $\lambda > 0$.
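To illustrate how the exponential weight in the $\lambda$-norm discounts late-time growth, the following sketch approximates $\sup_{t \in [0, \alpha]} e^{-\lambda t} \|v(t)\|$ on a uniform grid. The grid-based supremum, the function name `lambda_norm`, and the test function are assumptions made for illustration only.

```python
import math


def lambda_norm(v, alpha, lam, samples=1001):
    """Approximate sup over t in [0, alpha] of exp(-lam * t) * ||v(t)||_2
    by evaluating the weighted norm on a uniform grid (an approximation)."""
    best = 0.0
    for k in range(samples):
        t = alpha * k / (samples - 1)
        euclid = math.sqrt(sum(c * c for c in v(t)))
        best = max(best, math.exp(-lam * t) * euclid)
    return best


def growing(t):
    """Hypothetical test function v(t) = (e^t, 0)."""
    return (math.exp(t), 0.0)


# For this v, the weighted size is exp((1 - lam) * t): it grows for lam < 1,
# stays flat for lam = 1, and decays for lam > 1.
slow_weight = lambda_norm(growing, alpha=2.0, lam=0.5)
flat_weight = lambda_norm(growing, alpha=2.0, lam=1.0)
fast_weight = lambda_norm(growing, alpha=2.0, lam=3.0)
print(slow_weight, flat_weight, fast_weight)
```

A larger $\lambda$ makes contributions at late times count less, which is why this norm is convenient in convergence arguments on a finite interval.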
The standard Kronecker product of a matrix $G = (g_{ij})$ and a matrix $K$ is defined as the block matrix $G \otimes K = (g_{ij} K)$.
Consider a system with $N$ agents, each agent with $T$ pulse points. $Q = (V, E, Z)$ represents their interaction topology. The $i$th agent is governed by a nonlinear impulsive system, referred to as system (1). This system is right-continuous, where $X_i \in \mathbb{R}^n$ is the state vector of the $i$th agent, $u_i \in \mathbb{R}^p$ is the control function of the $i$th agent, $B$ is an $\mathbb{R}^{n \times p}$ matrix, $y_i \in \mathbb{R}^m$ is the output vector of the $i$th agent, the nonlinearity $f(\cdot, \cdot) : [0, \alpha] \times \mathbb{R}^n \to \mathbb{R}^n$ and the jump maps $M_t : \mathbb{R}^n \to \mathbb{R}^n$ are continuous, and $C(\tau)$ is a continuous $\mathbb{R}^{m \times n}$ matrix function. The impulsive time sequence is denoted by $0 < \tau_1 < \tau_2 < \cdots < \tau_T < \alpha$. $X(\tau_t^+) = \lim_{h \to 0^+} X(\tau_t + h)$ and $X(\tau_t^-) = X(\tau_t)$ represent the right and left limits of $X(\tau)$ at $\tau = \tau_t$, respectively.
We need the following conditions: (H1) $f(\cdot, \cdot)$ satisfies the Lipschitz condition for any $\tau \in [0, \alpha]$ and for any $x, y \in \mathbb{R}^n$.
http://www.journals.vu.lt/nonlinear-analysis
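As a small illustration of the Kronecker product mentioned in this section, the sketch below builds it from nested lists. The function name `kron` and the sample matrices are hypothetical; the example is chosen to show how an identity factor replicates a per-agent matrix along the block diagonal, which is the typical role of such products when agent-wise quantities are stacked into one vector.

```python
def kron(a, b):
    """Kronecker product of two matrices given as nested lists:
    block (i, j) of the result equals a[i][j] * b."""
    rows_b, cols_b = len(b), len(b[0])
    cols_a = len(a[0])
    return [
        [a[i][j] * b[k][l] for j in range(cols_a) for l in range(cols_b)]
        for i in range(len(a))
        for k in range(rows_b)
    ]


# Hypothetical data: a 2 x 2 identity (two agents) and a per-agent 2 x 2 matrix.
identity2 = [[1, 0], [0, 1]]
per_agent = [[1, 2], [3, 4]]
stacked = kron(identity2, per_agent)
for row in stacked:
    print(row)
```

With the identity as the left factor, the result is block diagonal, so the same per-agent matrix acts independently on each agent's block of a stacked vector.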
Under assumptions (H1) and (H2), following [30, Remark 4.1], system (1) with $X(0) = X_0$ has a unique solution in a space of piecewise continuous functions. Let $y_d(\tau)$ be the desired consensus trajectory of the MAS on the time interval $\tau \in [0, \alpha]$, $0 < \alpha < \infty$. Here, $y_d(\tau)$ is not necessarily continuous on the whole time interval $[0, \alpha]$. We regard the desired trajectory $y_d(\tau)$ as the virtual leader in the communication topology and mark it with vertex 0. Then, the information exchange among agents can be represented by an extended communication topology graph $Q^* = (V \cup \{0\}, E^*, A^*)$, where $E^*$ represents the edge set, and $A^*$ represents the weighted adjacency matrix. The control objective is to design appropriate iterative learning laws such that the outputs of all agents asymptotically converge to the desired trajectory $y_d(\tau)$.

Controllability results
We use the symbol $\sigma_{i,j}(\tau)$ to represent all the information received by the $j$th agent in the $i$th iteration. It can be expressed as the sum of the information transmitted from other agents to the $j$th agent and the possible information transmitted from the leader to the $j$th agent:
$$\sigma_{i,j}(\tau) = \sum_{k \in M_j} z_{j,k} \bigl( y_{i,k}(\tau) - y_{i,j}(\tau) \bigr) + d_j \bigl( y_d(\tau) - y_{i,j}(\tau) \bigr).$$
Here, $d_j$ indicates whether the $j$th agent can get information directly from the desired trajectory: if $(0, j) \in E^*$, then $d_j = 1$; otherwise, $d_j = 0$. The first subscript of $\sigma$ and $y$ indicates the iteration number, and the second subscript indicates the index of the agent. The subscripts of $z$ and $d$ are explained in Section 2. The derivative of $\sigma_{i,j}(\tau)$ is denoted by $\dot{\sigma}_{i,j}(\tau)$. In order to make the agents track the target trajectory as the iteration number increases, the D-type learning law (3) is employed, where $P(\tau)$ is an $\mathbb{R}^{p \times p}$ matrix function that is differentiable on the interval $[0, \alpha]$; it is combined with an initial state learning rule. Let $\psi_{i,j}(\tau)$ be the tracking error of the agent; that is, $\psi_{i,j}(\tau) = y_d(\tau) - y_{i,j}(\tau)$. With this notation, the learning law (3) can be rewritten in terms of the tracking errors. We collect the quantities of all agents in an arbitrary iteration into vector form, where $(\cdot)^T$ denotes the transpose of $(\cdot)$; then (5), (6), and (7) can be written in compact form. To study the multi-agent consensus problem with pulse points, (H1), (H2) and the following assumption are necessary in this paper.
Assumption 1. The desired trajectory $y_d$ is trackable; that is, there exists a state $X_d$ satisfying $y_d = C X_d$.
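The information signal received by each agent can be sketched at a single time instant as follows. The sketch assumes scalar outputs and the common neighborhood-error form $\sigma_j = \sum_k z_{j,k}(y_k - y_j) + d_j(y_d - y_j)$ with tracking error $\psi_j = y_d - y_j$; the function name, the chain topology, and the numbers are hypothetical. It also checks the standard identity that stacking the signals gives the Laplacian plus $\mathrm{diag}(d_j)$ applied to the stacked errors.

```python
def received_signal(outputs, desired, Z, d):
    """sigma_j = sum_k Z[j][k] * (y_k - y_j) + d[j] * (y_d - y_j), scalar outputs."""
    n = len(outputs)
    return [
        sum(Z[j][k] * (outputs[k] - outputs[j]) for k in range(n))
        + d[j] * (desired - outputs[j])
        for j in range(n)
    ]


# Hypothetical chain: leader -> agent 0 -> agent 1 -> agent 2 (0-based indices).
Z = [[0, 0, 0], [1, 0, 0], [0, 1, 0]]
d = [1, 0, 0]
outputs = [1.0, 2.0, 4.0]   # made-up outputs of one iteration at one instant
desired = 3.0               # made-up value of the desired trajectory

sigma = received_signal(outputs, desired, Z, d)

# Cross-check: sigma = (L + D) psi with the in-degree Laplacian L and D = diag(d).
n = len(Z)
psi = [desired - y for y in outputs]
H = [[(sum(Z[j]) + d[j]) if j == k else -Z[j][k] for k in range(n)] for j in range(n)]
sigma_from_errors = [sum(H[j][k] * psi[k] for k in range(n)) for j in range(n)]
print(sigma, sigma_from_errors)
```

Only agent 0 uses the desired trajectory directly; the other agents see it through their neighbors, which is why a path from the leader to every follower is needed.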
Theorem 1. Consider the multi-agent system (1) with a fixed communication topology, and let (H1), (H2), and Assumption 1 hold. Apply the D-type learning control law (5) and the initial state learning rule (6). Then, as the iteration number approaches infinity, the tracking error $\psi_i(\tau)$ converges to zero, i.e., $\lim_{i \to \infty} y_{i,j}(\tau) = y_d(\tau)$ for all $\tau \in [0, \alpha]$, if the desired trajectory has a path to every follower agent and $\Phi(C, B, P) < 1$, where $\Phi$ is defined by (10).
It should be noted that each iteration updates the parameters of the entire system, and the range of the independent variable $\tau$ is bounded, whereas the number of iterations is not limited. In other words, convergence here means pointwise convergence over the entire time interval as the iteration number increases to infinity.
Proof. Consider the tracking error $\psi_{i+1,j}(\tau)$ of the $j$th agent in the $(i + 1)$th iteration, and set $y_i(\tau) = (y_{i,1}(\tau)^T, y_{i,2}(\tau)^T, \ldots, y_{i,N}(\tau)^T)^T$. The expression obtained from (4) involves the stacked nonlinearity $F(X_i, s) = (f(X_{i,1}, s)^T, f(X_{i,2}, s)^T, \ldots, f(X_{i,N}, s)^T)^T$, where $f$ denotes the nonlinear term of system (1), and $I_N$ is the $N \times N$ identity matrix.
The proof is completed.
Further, we consider the PD-type learning law (27), where $P(\tau)$ and $Q(\tau)$ are $p \times p$ matrix functions that are differentiable on the interval $[0, \alpha]$, together with the initial state learning rule (28). From the above, one has the following result.
Theorem 2. Consider the multi-agent system (1) with a fixed communication topology, and let (H1), (H2), and Assumption 1 hold. Apply the PD-type learning control law (27) and the initial state learning rule (28). Then, as the iteration number approaches infinity, the tracking error $\psi_i(\tau)$ converges to zero, i.e., $\lim_{i \to \infty} y_{i,j}(\tau) = y_d(\tau)$ for all $\tau \in [0, \alpha]$, if the desired trajectory has a path to every follower agent and $\Phi(C, B, P) < 1$, where $\Phi$ is defined by (10).
The proof is completed.

An example
We can use the following procedure to carry out computer simulation experiments:
Step 1. Give the expression of the target trajectory $y_d$, the expression of the multi-agent system (1), and the initial parameters of the D-type or PD-type learning laws.
Step 2. Generate the system output y i .
Step 3. Calculate the tracking error $\psi$ and its norm $\|\psi\|$. If $\|\psi\| < \varepsilon$, the program ends. If $\|\psi\| \ge \varepsilon$, go to Step 4. Here, $\varepsilon$ is a given positive real number.
Step 4. Update the input according to the learning law using tracking errors and the communication topological relationship between agents, then go to Step 2.
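Steps 1 to 4 can be sketched as a short loop. This is a toy illustration under stated assumptions and not the example of this paper: three scalar agents with the hypothetical dynamics $\dot{x} = -\sin x + u$ and output $y = x$, an explicit Euler discretization, one hypothetical impulse $x \mapsto 1.1x$, a chain topology, a scalar learning gain, a classical D-type update $u_{i+1} = u_i + \text{gain} \cdot \dot{\sigma}_i$ approximated by forward differences, and identical initial resetting in place of an initial state learning rule. All names and constants are made up for the sketch.

```python
import math

H_STEP = 0.01       # Euler step size (assumption)
STEPS = 100         # number of steps, so the horizon is STEPS * H_STEP = 1.0
JUMP_STEP = 50      # the impulse acts on the Euler step that starts at sample 50
JUMP_GAIN = 0.1     # hypothetical jump map x -> x + 0.1 * x
Z = [[0, 0, 0], [1, 0, 0], [0, 1, 0]]  # Z[j][k] = 1: agent k sends to agent j
D = [1, 0, 0]       # only agent 0 receives the virtual leader's trajectory
GAIN = 1.0          # scalar learning gain (stands in for a gain matrix)


def desired(k):
    """Step 1: piecewise continuous target with a jump after the impulse step."""
    return math.sin(k * H_STEP) + (0.5 if k > JUMP_STEP else 0.0)


def simulate(u):
    """Step 2: Euler simulation of one agent for a given input sequence u."""
    x = [desired(0)]  # identical resetting to the target's initial value
    for k in range(STEPS):
        nxt = x[k] + H_STEP * (-math.sin(x[k]) + u[k])
        if k == JUMP_STEP:
            nxt += JUMP_GAIN * nxt  # state-dependent impulse
        x.append(nxt)
    return x  # the output equals the state in this toy model


def run_ilc(tol=1e-6, max_iter=200):
    """Repeat Steps 2 to 4 until the sup-norm of the tracking error is below tol."""
    n = len(Z)
    u = [[0.0] * STEPS for _ in range(n)]
    history = []
    for _ in range(max_iter):
        y = [simulate(u[j]) for j in range(n)]
        psi = [[desired(k) - y[j][k] for k in range(STEPS + 1)] for j in range(n)]
        err = max(abs(v) for row in psi for v in row)  # Step 3: error norm
        history.append(err)
        if err < tol:
            break
        # Step 4: neighborhood signal, then a forward-difference D-type update.
        sigma = [
            [
                sum(Z[j][m] * (y[m][k] - y[j][k]) for m in range(n)) + D[j] * psi[j][k]
                for k in range(STEPS + 1)
            ]
            for j in range(n)
        ]
        for j in range(n):
            for k in range(STEPS):
                u[j][k] += GAIN * (sigma[j][k + 1] - sigma[j][k]) / H_STEP
    return history


history = run_ilc()
print(len(history), history[0], history[-1])
```

The list `history` records the error norm of each iteration, so it can be plotted against the iteration index in the same spirit as the error curves discussed in this section.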
We consider a MAS consisting of five agents on the time interval $\tau \in [0, 6]$ with prescribed initial values, where, for all $i \in V$, $X_{i,1}$ represents the first state of the $i$th agent, and $X_{i,2}$ represents the second state. The communication topology is shown in Fig. 1, where 0 represents the leader. The Laplacian matrix is obtained from Fig. 1, together with $D = \mathrm{diag}(1, 2, 2, 1)$. The target trajectory is the trajectory of vertex 0, and $y_{d1}$ and $y_{d2}$ represent its first and second dimensions, respectively. For the D-type learning control law, the initial input is $u_1(\tau) = [0, 0]^T$. Here, $\Phi(C, B, P) = 0.8918 < 1$, which satisfies the condition of Theorems 1 and 2. Therefore, the multi-agent system can uniformly track the target trajectory under the given learning control. Figures 2 and 3 show that the error between the output value and the target trajectory gradually converges to 0 (for both the D-type and the PD-type laws). Figures 4-7 show the iterative learning process of the two output trajectories with the D-type learning law. Figures 8-11 show the iterative learning process of the two output trajectories with the PD-type learning law. Figure 12 shows the iteration profile of the initial values.
As the number of iterations increases, the output trajectory gradually converges to the desired trajectory. At the 250th iteration, the consensus errors of the D-type learning law and the PD-type learning law are shown in Table 1. As can be seen from the table, when the number of iterations reaches 250, the convergence error of the system under the D-type learning law is significantly smaller than that under the PD-type learning law. For this numerical example, a more complex learning law does not necessarily lead to a better control effect. However, we note that the introduction of a proportional term may help stabilize the system dynamics.

Conclusion
To solve the problem of uniform tracking of impulsive MASs, this paper uses two kinds of iterative learning laws to control the system and finds sufficient conditions under which the system converges to the target trajectory for each of the two learning laws. The conditions show that when the initial parameters of the system meet certain requirements, the initial parameters of the learning law can be adjusted accordingly. After finitely many iterations, the error between the output and the target trajectory can be made sufficiently small. Compared with a single agent, the agents of a MAS can exchange information, which better ensures the effectiveness of tracking. Compared with a continuous system, an impulsive system is more general and closer to real cases. Finally, a numerical example is given to demonstrate the effectiveness of the conclusions. In future work, we will construct a fractional iterative learning law to control the impulsive MAS and study its consistency tracking.