Real-Time Optimization

Modifier Adaptation with Bayesian Optimization using EIC acquisition

Preliminary thesis results

Problem Description

The Williams-Otto CSTR is a benchmark process for real-time optimization (RTO) systems. It consists in the following reactions:

$ A + B \rightarrow C $ (1)

$ B + C \rightarrow P + E $ (2)

$ C + P \rightarrow G $ (3)

where A and B are raw materials, P is the desired product, E is the byproduct with added sales value, C is the complex intermediate without sale value, and G is the residual material. The reactor is fed by two reactant flow rates (Fa and Fb). The full set of mass balance equations can be found in [1], for example.

Each reaction has its rate $k$ defined as a function of the reactor temperature Tr:

$ k_1 = 1.6599\times10^6\exp(-6666.7/Tr) $

$ k_2 = 7.2177\times10^8\exp(-8333.3/Tr) $

$ k_3 = 2.6745\times10^{12}\exp(-11111/Tr) $

Fa assumes a fixed, therefore Fb and Tr are our manipulated (decision) variables, $u=[Fb, Tr]$. The optimization objective is finding a combination of these variables in order to maximize the profit when the system reaches steady-state operation:

$ \Phi(u) = 1043.38*Xp*(Fa+Fb)+20.92*Xe*(Fa+Fb) - 79.23*Fa - 118.34*Fb$

where $Xp$ and $Xe$ are the amount of P and E in steady-state operation, respectively. Besides steady-state operation, the following operational constraints are enforced:

$Fb \in [3,6]$

$Tr \in [70, 100]$

$Xa < 0.12$

$Xg < 0.08$

In practice, however, it is very unlikely to know the complete model of a given process. It can be in the form of parametrical and/or structural mismatch. Parametrical mismatch means we know the model structure, but its parameters are uncertain. Structural mismatch is the case where we the complete model equations are unknown, which usually happens when trying to model complex processes and have to apply simplifications to some of its behaviors. It is easy to see that if we try to optimize a process based on a uncertain model, we can produce sub-optimal or even unfeasible operating points. This is where RTO systems come to the rescue: they are specifically designed to handle model uncertainty and lead to process optimality under convergence.

For the Williams-Otto CSTR we know all the process equations and parameters. Therefore, to study the real-time optimization capabilities, we will use a simplified version of this model in order to simulate what would happen in real applications. For that, we consider the following model structure:

$ A + 2B \rightarrow P + E $ (1)

$ A + B + P \rightarrow G $ (2)

And the reaction rates are given by:

$ k_1 = 1.655\times10^8\exp(-8077.6/Tr) $

$ k_2 = 2.611\times10^{13}\exp(-12438.5/Tr) $

Analysis

The real process will be called the "plant", while the uncertain one is the "model". First, we create and instance for each one:

Next, let's have a look at the decision surface of our cost function, along with the constraints. This is a way to observe the effects of the plant-model mismatch. First we do a grid search over the input domain, storing the objective and constraint values.

Now, the contour plot is generated for both the plant and model. The feasible region is limited by the red surface.

From this chart we notice these interesting information:

  1. The unconstrained minima of plant and model and very different. Optimizing over the model would definitely result in a sub-optimal operating point.
  2. When constraints are added, the optima are located at their intersection. This could lead to unfeasible operation if optimazing over the model, if not properly evaluated beforehand.

Let's have a deeper look at item (2). In order to do that, we should solve the optimization problem described previously, for both the plant and model. The Differential Evolution algorithm will be used, because of it's nice capabilities for handling restrictions and global convergence.

Although the solution found for both the model and plant are very close, let's see what happens if we apply the input found with the model into the plant. The 'x' represents the model solution, while the "*" the plant optimum.

Notice that although the solution remained inside the feasible region, it is definitely sub-optimal. It is easy to notice that depending on the model parameters, the optimal solution could easily become unfeasible. Thus, one can see that some sort of adaptation of the model is necessary to ensure convergence to the actual plant optimum, while respecting the process constraints.

Real-Time Optimization

RTO systems are capable of iteratively driving the system towards the optimum in spite of plant-model mismatch. The entire system is displayed in the image below:

RTO System

Several approaches exist in the literature, and can be classified based on the adaptation strategy they use. The most relevant are the two-step (TS) and modifier adaptation (MA). TS employs as adaptation strategy the update of the model parameters by solving a data reconciliation problem based on measured plant data. On the other hand, modifier adaptation uses plant data to add the so-called modifiers to the cost and constraint functions of the model-based optimization problem. Each method has its pros and cons, and their usage will depend on the process being optimized. Here I am going to use first the Modifier Adaptation with Gaussian Processes (GP) scheme.

Modifier Adaptation with Gaussian Processes

This is a novel technique that leverages GP regression to represent the plant-model mismatch. It removes the need of calculating plant gradients, a requirement for previous MA schemes. The diagram below display how it works in practice:

MA-GP

The RTO system is run with the following parameters:

This experiment is repeated 10 times, given the stochastic nature of the RTO system.

Since we know the plant's optimal cost and input signal, we can evalute the system performance based on the relative optimality gap:

$\Delta u\% = \left\|100 \frac{u - u^{opt}}{u^{opt}} \right\|, \quad \Delta \phi \% = 100 \frac{\phi - \phi^{opt}}{\phi^{opt}} $

Before running the RTO system in closed-loop, first we collect some initialization data. After that, we have to build the two main elements of such system:

After that, we pass these objects to the RTO class, which will be responsible for running the system in closed loop, as shown in the previous diagrams. Internally, it runs the following steps:

  1. Run the model-based optimization problem
  2. Filter the calculated optimal input using an exponential filter
  3. Apply the input to the plant and collect the necessary measurements
  4. Execute the adaptation strategy

All the RTO iteration data is stored in a sqlite database. This makes sure we have everything necessary for analyzing its performance as a whole later. This notebook uses a memory datbase, but one can also use a physical one.

Finally, we can run the RTO system. The last sampled initial operating point is used to start the system.

It is also possible to look at the operating points calculated at each RTO iteration. The initial dataset is represented by the pink diamonds, while the closed-loop iterations are in black. Notice that the system rapidly converges to the plant optimum.

Effect of Noise

The results above are for the scenario where we have no noise in the plant measuremets. Since this is not the reality, an interesting test is to check how it can impact the RTO performance. For that, we consider a 0.01 additive gaussian noise, but using the same parameters as the previous system.

Using different initial data points

We start to see more variation, but results are still equivalent for both cases. What if we initialized the system at different starting points?

As seen in the images above, for the Williams-Otto problem, there is a very small effect on using different points for initialiazing the RTO system. Although there is some variation introduced by the measuremente noise,the system easily converges under both optimization algorithms. It is important to notice that this may not be entirely true if the problem exhibits more non-convexity, as investigated in [YY].

EIC acquisition function

There is a clear link between RTO and Bayesian optimization. Both problems try to optimize and unknown and expensive-to-evaluate function by sampling it according to some heuristics. The main difference lies in the fact that the Bayesian framework is purely data-drive, while in RTO we have a model that, although imperfect, captures the main behavior of the unknown function. Notice that this works as some sort of prior knowledge, which in the Bayesian frameork could be historical samples of the function to be optimized, for example.

Considering the MA-GP scheme, one can see that the RTO system learns the plant-model mismatch using GPs and the prior knowledge of the model is used to drive plant optimization. In the Bayesian framework, this is achieved with the use of so-called acquisition functions. Therefore, one could ask the following question: what if we included the acquisition function in RTO procedure? Could this improve the system's performance?

This was already done by [XX], which applied an EI acquisition to the objective, but used derivative-free and trust-region concepts to handle constrained problems. The idea behind this is to reduce or increase the trust-region size based on thje discrepancy between model estimations and plant measurements. But, one can clearly see that embedding a fully constrained Bayesian framework could be easily achieved. We can change the model-based optimization problem to use the EIC acquisition function, since we have GPs already trained on the constraints mismatch. This turns our constrained problem into a unconstrained one, using a barrier-like objective function.

First, let's run the optimization problem with the EIC acquisition function, using the same default parameters for GP learning. We also run MA-GP as a baseline to compare the results. The last sampled initial operating point is used to start the system. For now, only the noiseless case is considered, to avoid unexpected effects.

Optimizer Choice

Comparison with MA-GP

As seen before, SQP is unable to even find properly optimizer the EIC function. Therefore, we continue the remainder of this tutorial considering the DE algorithm as the default optimizer. Now, we comparte the results classic MA-GP scheme and with EIC acquisition function, considering the same experimental conditions.

The results for the cost and input optimality gap are displayed below. Notice that using the proposed Bayesian framework, we achieve similar results as the MA-GP baseline. However, the Bayesian approach presents more variability during the optimization steps, we could be related to the way we penalize the constraints violations. As seen previously, the optimal point will lie exactly at the intersection of two constraints, which poses a challenge for the RTO system.

Another angle is to understand how the EIC acquisition function balances exploration and explotation. Maybe there is some exploitation happening when it shouldn't.

Having a further look at the constraints, we can observer an interesting situation: using the EIC objective function, there is a high penalty for violating constraints. Whenever any of them are violated, there next solution found is biased towards this result, causing the system to be driven away from the unfeasible region. This is an interesting property, but it can prevent it from converging to the acutal plant optimum, specially if it lies at non-empty set of active constraints.

Investigating the EIC decision surface

We previously saw that the proposed Bayesian framework has some variation during the RTO iterations that caused its deviation from the plant optimum. In the next analysis, we plot some of the iterations in order to understand how the algorithm is driving the optimization process. For that, the EI and EIC functions' decision surface will be investigated.

First iteration

When the RTO is started, the Bayesian problem has the following shape. In this step, there is only knowledge of the random initialization points, meaning our knowledge about the system limited. If we considered the EI function only, notice that we would find a solution that is very close to unconstrained optimum. Since we need to add the constraints, we change its shape entirely, leading to the EIC function in the last subplot. We start observing these properties:

  1. With very limited knowledge of the system, the Bayesan approach is able to have a good description of both the objective and constraints. However, as seen in zoomed in chart, the EIC optimum is still far from the actual plant optimum.
  2. The problem becomes much harder to be solved, and we start seeing extra non-convexity. Increasing the chart scale, notice the EIC shape compelxity and that outside this region, the problem is mostly flat. This is definitely troublesome for deterministic algorithms, since we are very likely to get stuck if initialized at the flat area. This probably explains why the SQP algorithm was unable to drive the RTO system to new solutions.

Given this backgound, we analyze next some of the other solutions found by the RTO system. Some key iterations were selected, based on the previous charts.

From the previous charts, we extract the following interesting insights:

  1. On some iterations, the problem is extremely hard to solve because of its non-convexity, and even DE is unable to properly find the EIC optimum due to numerical issues. For instance, in the last iteration displayed it converges to a solution far from the optimum and we notice that a better solution remained close to the plant optimum, but it is basically a point. This would be very hard to find, requiring a lot of computational effort and also adjustment of its parameters.
  2. It is also possible in some iterations that the EIC function is mostly flat, meaning there is not actually a good improvement over the entire search space. This suggests some improvements of the algorithm into this direction, as the solution will be a random point and can probably degrade the system performance.
  3. A question that arises immediately is: why does the Bayesian approach can come close to the plant optimum, but does not converge like MA-GP? It we analyze the EIC function, we notice that once we are close the optimum, there is not much improvement over the objective function, and an exploration of the feasible region is favored over exploitation, since it adds more information. This is a know property of the Bayesian framework, and we can spot some iterations where we actually gain more understanding of unexplored areas near the constraints.

If the target is to converge the RTO to the plant optimum as quickly as possible, then the proposed Bayesian approach needs a better way to control the exploitation. However, this exploitation approach can be benefitial depending on the problem properties. For instance, if we had disjoint feasible regions containing an optimal and sub-optimal solution. Would MA-GP be able to properly find the global optimum if initialized at a local one? Would tge Bayesian exploitation approach be able to leverage its exploitation behavior in that scenario?

Effect of initialization point EIC

As a last experiment, we analyze how the choice of initial operating points could affect the EIC RTO performance. The same initial points from the MA-GP experiment are used, and both approaches are compared.

From the images above, we notice that both approaches have similar convergence behavior. There is not much difference between them regarding convergence, although one can notice that MA-GP exhibits less variation near the system convergence. This is possibly an effect of the EIC properties discussed proviously.

Conclusion

This report shows several aspects of the MA-GP RTO problem, as well as the results of a proposed approach using constrained Bayesian optimization with the EIC acquisition function. Using the Williams-Otto reactor, we start showing how the choice of optimiation algorithm and initial points can affect the RTO performance. After that, we start analyzing the novel approach results, using the MA-GP scheme as a baseline. Interesting conclusion could be drawn from this analsys, such that:

Therefore, as a next step, we aim to investigate different instances of the Williams-Otto problem, specially with non-convex feasible regions. This is a way to validate the limits of MA-GP, and also if the EIC exploration can become a valuable property under this situation.