DE102024202197A1

DE102024202197A1 - Controlling a vehicle with estimation of a control-based time delay by a feedforward controller

Info

Publication number: DE102024202197A1
Application number: DE102024202197.0A
Authority: DE
Inventors: Amardeep Mishra; Michael Fleps-Dezasse; Lothar Kiltz
Original assignee: ZF Friedrichshafen AG
Current assignee: ZF Friedrichshafen AG
Priority date: 2024-03-08
Filing date: 2024-03-08
Publication date: 2025-09-11

Abstract

A method (600) using a feedforward controller (165) for controlling, in particular steering, a vehicle (110), in particular for autonomous driving, wherein the method (600) comprises the following steps: inputting (605) a set of path curvatures (220), each describing a path to be traveled by the vehicle (110), into the feedforward controller (165); estimating (610) a control-based time delay by means of the feedforward controller (165); selecting (615) a preview path curvature (235) from the set of path curvatures (220) based on the estimated control-based time delay and a driving parameter of the vehicle (110); deriving (620) a manipulated variable of the feedforward controller (165) from the preview path curvature (235); and outputting (625) the manipulated variable of the feedforward controller (165), wherein the manipulated variable of the feedforward controller (165) is used to control the vehicle (110).

Description

The present invention relates to a method for controlling, in particular steering, a vehicle, in particular for autonomous driving, using a controller, in particular a feedforward controller. The present invention further relates to a computer-implemented method, a data processing system, a computer program, and/or a computer-readable medium.

Modern vehicles, such as cars, trucks, buses, and the like, often have a control system that supports and/or even replaces manual driving. Such a system can be regarded as an autonomous driving system, where "autonomous" can mean either partially autonomous, i.e., only supporting the driver, or fully autonomous, i.e., replacing the driver.

The output of the control system can be used to steer the vehicle. It is inherent in control systems that the controller needs a certain amount of time to settle the manipulated variable. Furthermore, depending on the control strategy, oscillations of the manipulated variable can occur during control. Today's state-of-the-art systems must be designed to be fast, i.e., to track the manipulated variable closely, and/or to enable a comfortable ride for the passengers.

It is therefore an object of the present invention to provide a method for controlling a vehicle that is fast, in particular fast-tracking, and/or comfortable for the passengers. It is a further object of the present invention to provide a computer-implemented method, a data processing system, a computer program, and/or a computer-readable medium that is fast, in particular fast-tracking, and/or comfortable for the passengers.

The invention achieves this object by means of the subject matter of the independent claims. The dependent claims specify preferred embodiments.

A method for controlling, in particular steering, a vehicle, in particular for autonomous driving, with a feedforward controller comprises: inputting a set of path curvatures, each describing a path to be traveled by the vehicle, into the feedforward controller; estimating a control-based time delay by means of the feedforward controller; selecting a preview path curvature from the set of path curvatures based on the estimated control-based time delay and a driving parameter of the vehicle; deriving a manipulated variable of the feedforward controller from the preview path curvature; and outputting the manipulated variable of the feedforward controller, wherein the manipulated variable of the feedforward controller is used to control the vehicle.

The control-based time delay is, for example, the time delay that occurs in a controller, for example in the feedforward controller and/or in further controllers. By estimating the control-based time delay and reusing the estimate, in particular for selecting the preview path curvature, the feedforward controller can select exactly the path curvature, namely the preview path curvature, that enables fast and/or comfortable control. A set of path curvatures comprises, for example, several path curvatures, in particular a sequence of path curvatures. A driving parameter can be, for example, a driving speed, an acceleration, a deceleration, and/or a yaw rate of the vehicle. The preview path curvature is preferably selected based on the estimated control-based time delay and the driving speed of the vehicle. After the preview path curvature has been selected, the manipulated variable is preferably output, which, for example, causes the vehicle to follow a desired driving trajectory, in particular the curvature of a curve, quickly and/or comfortably.
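The selection step can be illustrated with a minimal sketch. It assumes, beyond what the text states, that the set of path curvatures is sampled at a uniform arc-length spacing ahead of the vehicle and that the look-ahead distance is simply the estimated delay multiplied by the driving speed. The function name `select_preview_curvature` and all numeric values are hypothetical.

```python
def select_preview_curvature(curvatures, spacing_m, delay_s, speed_mps):
    """Pick the curvature sample the vehicle reaches after `delay_s` seconds.

    Assumption: `curvatures[i]` is the path curvature i * spacing_m metres ahead.
    """
    if not curvatures:
        raise ValueError("curvature set must not be empty")
    lookahead_m = max(0.0, delay_s * speed_mps)  # distance covered during the delay
    index = round(lookahead_m / spacing_m)       # nearest sample along the path
    index = min(index, len(curvatures) - 1)      # clamp to the end of the preview horizon
    return curvatures[index]


# Curvatures in 1/m, sampled every 2 m ahead of the vehicle (illustrative values).
path_curvatures = [0.00, 0.00, 0.01, 0.02, 0.04, 0.04]
kappa_preview = select_preview_curvature(
    path_curvatures, spacing_m=2.0, delay_s=0.3, speed_mps=20.0
)
```

With a longer estimated delay or a higher speed, the selected sample lies further ahead, so the feedforward controller starts reacting to an upcoming curve earlier.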

Preferably, the control-based time delay is estimated using a machine learning algorithm, in particular a neural network. A driving parameter is expediently used as an input variable for the machine learning algorithm, in particular the neural network. Preferably, the machine learning algorithm, in particular the neural network, comprises a radial basis function. Advantageously, the machine learning algorithm, in particular the neural network, comprises an update law. In particular, by using a machine learning algorithm, the control-based time delay can be estimated particularly precisely and/or quickly and/or comfortably. Advantageously, a radial basis function is used in the feedforward controller, in particular to estimate the control-based time delay. Preferably, the preview curvature, in particular the preview path curvature, of the route ahead of the vehicle is generated and/or selected based on the estimated control-based time delay.
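As a rough sketch of how a radial basis function network could map a driving parameter to a delay estimate, the following assumes Gaussian basis functions, a single input (driving speed in m/s), and fixed, already-learned output weights. The centers, width, and weights are illustrative assumptions, and the update law mentioned in the text is not modelled here.

```python
import math


def rbf_features(x, centers, width):
    """Gaussian radial basis activations for the input vector x."""
    return [
        math.exp(-sum((xi - ci) ** 2 for xi, ci in zip(x, c)) / (2.0 * width ** 2))
        for c in centers
    ]


def estimate_delay(x, centers, width, weights):
    """Weighted sum of the RBF activations, clipped to be non-negative."""
    phi = rbf_features(x, centers, width)
    return max(0.0, sum(w * p for w, p in zip(weights, phi)))


# Hypothetical network: three neurons centered at 5, 15 and 25 m/s.
centers = [[5.0], [15.0], [25.0]]
weights = [0.10, 0.20, 0.30]
delay = estimate_delay([15.0], centers, width=5.0, weights=weights)
```

The estimate is dominated by the neuron whose center is closest to the current speed, with smaller contributions from its neighbours.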

A feedback controller is preferably provided. Advantageously, an actual yaw rate of the vehicle is measured, in particular using a yaw rate sensor. Advantageously, a yaw rate error is determined from the difference between the actual yaw rate and a desired yaw rate. The yaw rate error is expediently processed in the feedback controller. Advantageously, the feedback controller outputs a manipulated variable of the feedback controller. Advantageously, the manipulated variable of the feedback controller, in particular together with the manipulated variable of the feedforward controller, is used to control, in particular to steer, the vehicle. Preferably, when estimating the control-based time delay, the feedforward controller estimates the control time of the feedback controller at least partially, in particular completely. Preferably, when estimating the control-based time delay, the feedforward controller takes the control time of the feedback controller into account at least partially, in particular completely. In this way, for example, the entire control-based time delay, in particular that of the feedforward controller and/or the feedback controller, can be estimated, so that the control-based time delay is predicted particularly precisely. This makes it possible to achieve very good, in particular fast and/or comfortable, control.
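A minimal sketch of how the two manipulated variables might be combined is given below. The text only says that they are used "together"; adding them and clipping the sum to an actuator limit is an assumption made for illustration, as are the function names and the limit `delta_max`.

```python
def yaw_rate_error(yaw_rate_actual, yaw_rate_desired):
    """Yaw rate error as the difference between actual and desired yaw rate."""
    return yaw_rate_actual - yaw_rate_desired


def total_steering(delta_ff, delta_fb, delta_max):
    """Sum the feedforward and feedback steering commands and clip the result.

    Assumption: the combination is a plain sum; delta_max is a hypothetical limit.
    """
    return max(-delta_max, min(delta_max, delta_ff + delta_fb))
```

For example, `total_steering(0.4, 0.3, 0.5)` is limited to `0.5`, while small commands pass through unchanged.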

Preferably, the manipulated variable of the feedback controller is generated using a machine learning algorithm. Advantageously, the machine learning algorithm comprises an online learning algorithm. The machine learning algorithm expediently comprises a variable learning rate. Using the machine learning algorithm allows the feedback controller to control particularly well. The control-based time delay, which may also arise, for example, from the machine learning algorithm, is estimated and/or taken into account in particular in the feedforward controller. This enables fast and/or comfortable overall control.

Advantageously, the feedback controller comprises a first feedback controller and a second feedback controller.

The first feedback controller advantageously comprises a neural network. In particular, the first feedback controller comprises an integral reinforcement learning algorithm. Advantageously, the yaw rate error and/or the desired yaw rate is an input variable for the integral reinforcement learning algorithm. In particular, the first feedback controller is an integral reinforcement learning agent (IRL agent). The critic neural network is preferably tuned online within an integral reinforcement framework, for example with a variable learning rate, in particular by gradient descent. Advantageously, the learning rate is a parabolic mapping. In particular, the parabolic mapping links the learning rate to the instantaneous yaw rate error. Advantageously, the IRL agent initiates a policy iteration process, in particular without an initial admissible policy. Advantageously, the IRL agent generates an approximately optimal control with one, in particular exactly one, neural network, for example a critic neural network.
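The text does not give the exact form of the parabolic mapping. One plausible sketch, assuming a quadratic in the yaw rate error with a lower floor and an upper cap, is shown below; the parameters `eta_min`, `gain`, and `eta_max` are hypothetical.

```python
def parabolic_learning_rate(yaw_rate_err, eta_min=0.01, gain=5.0, eta_max=1.0):
    """Learning rate that grows quadratically with the yaw rate error.

    Assumed form: eta = eta_min + gain * e^2, capped at eta_max.
    """
    return min(eta_max, eta_min + gain * yaw_rate_err ** 2)
```

Under this assumption the learning rate is small near zero error and grows symmetrically for positive and negative errors, which matches the described behaviour of learning more aggressively when the error is large.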

The update law 190 for the critic NN 185 is: $\hat{\dot{W}} = \underbrace{\eta(z)\,\bar{\vartheta}\,\hat{e}}_{\text{first term}} + \underbrace{\eta(z)\,\Xi(z, u) \left( \frac{\nabla\vartheta^{T}}{2}\, G R^{-1} (I_{m} - B)\, G^{T}\, \nabla\vartheta\, \hat{\dot{W}}\, z \right)}_{\text{second term}} + \eta(z)\,\hat{e}$

where $\hat{e} = \int_{t - T}^{t} e^{-\gamma (\tau - t + T)} \left[ Q(z) + U(u) \right] d\tau + \hat{\dot{W}}^{T} \Delta\vartheta$ is the HJB approximation error, with $\Delta\vartheta \equiv e^{-\gamma T}\, \vartheta(z(t)) - \vartheta(z(t - T))$.
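To make the structure of the HJB approximation error more concrete, the following sketch evaluates it numerically. It assumes that the per-step cost $Q(z) + U(u)$ has been sampled at a fixed step `dt` over the window $[t - T, t]$ and uses the trapezoidal rule for the integral. All function and parameter names are hypothetical, and `weights` stands for the weight vector that multiplies $\Delta\vartheta$ in the formula.

```python
import math


def discounted_integral(costs, dt, gamma):
    """Trapezoidal estimate of the integral of exp(-gamma*s) * cost(s), s in [0, T].

    costs[k] is the cost sampled at s = k * dt, where s = tau - (t - T).
    Requires at least two samples.
    """
    weighted = [math.exp(-gamma * k * dt) * c for k, c in enumerate(costs)]
    return dt * (sum(weighted) - 0.5 * (weighted[0] + weighted[-1]))


def hjb_error(costs, dt, gamma, weights, theta_now, theta_prev):
    """Integral reinforcement plus weights^T * (exp(-gamma*T)*theta(t) - theta(t-T))."""
    window = dt * (len(costs) - 1)  # length T of the integration window
    delta_theta = [
        math.exp(-gamma * window) * a - b for a, b in zip(theta_now, theta_prev)
    ]
    reinforcement = discounted_integral(costs, dt, gamma)
    return reinforcement + sum(w * d for w, d in zip(weights, delta_theta))
```

The error vanishes when the accumulated discounted cost over the window is exactly balanced by the change in the approximated value between the start and end of the window.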

Furthermore:

$\vartheta(z(t)): \mathbb{R}^{n} \to \mathbb{R}^{N}$: regressor vector of the critic neural network (with $N$ the number of hidden neurons), evaluated at the augmented state $z(t)$
$\hat{W}: \mathbb{R}^{n} \to \mathbb{R}^{N}$: critic NN weights
$\eta(z)$: the variable learning rate
$\bar{\vartheta}$: the normalized regressor vector
$\Xi(z, u)$: the smoothed switching function
$I_{m}$: the identity matrix of dimension $m$
$B = \operatorname{diag}\{\tanh^{2}(\tau_{2i}(z))\} \in \mathbb{R}^{m \times m}$, where $\tau_{2} = \frac{1}{2 u_{m}} R^{-1} G^{T} \nabla\vartheta^{T} \hat{\dot{W}} \in \mathbb{R}^{m}$
$R \in \mathbb{R}^{m \times m}$: penalty on the controls
$G \in \mathbb{R}^{m \times m}$: control coupling matrix of the augmented system
$K_{1}$ and $K_{2}$: constants of appropriate dimensions
$\mathbb{R}$: the set of real numbers
$\nabla$: gradient
$V: \mathbb{R}^{n} \to \mathbb{R}$: the value function (scalar output of the critic NN)
$z = (x - x_{des};\, x_{des}) \in \mathbb{R}^{2n}$: the augmented state, comprising the state error and the target state
$\gamma$: discount factor
$Q = z^{T} Q_{1} z: \mathbb{R}^{n} \to \mathbb{R}$: cost (also: effort, penalty) per step of the augmented state
$u_{m} \in \mathbb{R}$: maximum control effort (saturation limit of the actuator)
$U(u) = 2 u_{m} \int_{0}^{-u_{m} \tanh A(z)} \tanh^{-1}(v / u_{m})^{T} R \, dv$

Here, $U(u): \mathbb{R}^{m} \to \mathbb{R}$ is the cost (effort) per step of the control (if the actuator constraints are taken into account), and the following applies: $A(z) = \frac{1}{2 u_{m}} R^{-1} G^{T} \nabla V$

In the update law given above, the first term minimizes the instantaneous approximation error, the second term comes into play when the Lyapunov value $\frac{1}{2} z^{T} z$ no longer decreases along the trajectories of the augmented system, and the third term determines the size of the residual set to which the state error converges.

The IRL agent 200 is determined as follows: $\delta = -u_{m} \tanh\left( \frac{1}{2 u_{m}} R^{-1} G^{T} \nabla\vartheta^{T} \hat{\dot{W}} \right)$
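A minimal sketch of this saturated control law for a single steering input ($m = 1$) follows. It assumes that the penalty $R$ reduces to a scalar `r`, that the coupling reduces to a vector `g` with one entry per state, and that `grad_theta[k][j]` holds the derivative of the k-th regressor with respect to the j-th state. These shapes and names are assumptions made for illustration; `weights` stands for the weight vector that appears in the formula.

```python
import math


def irl_steering(grad_theta, weights, g, r, u_max):
    """Saturated steering command: -u_max * tanh(g^T grad_V / (2 * u_max * r)).

    grad_V is approximated as grad_theta^T @ weights (one entry per state).
    """
    n_state = len(g)
    grad_v = [
        sum(grad_theta[k][j] * weights[k] for k in range(len(weights)))
        for j in range(n_state)
    ]
    arg = sum(gj * vj for gj, vj in zip(g, grad_v)) / (2.0 * u_max * r)
    return -u_max * math.tanh(arg)
```

Because of the hyperbolic tangent, the command can never exceed the saturation limit `u_max` in magnitude, regardless of how large the weights become.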

Advantageously, the second feedback controller comprises a direct adaptive neural network. Advantageously, the second feedback controller comprises an adaptive, neural-network-based (NN-based) sliding mode control. Such a control can also be called ANNSMC (from ANN: "artificial neural network" and SMC: "sliding mode control"). The adaptive, neural-network-based sliding mode control is preferably used in conjunction with the IRL agent, in particular to distribute the learning load and/or to achieve tighter cross-track errors. Advantageously, the learning rate is determined by a parabolic mapping of the yaw rate error. A variable control range for the error dynamics is expediently provided, in particular by the second feedback controller.

The ANNSMC 205 is updated according to the following formula: $\dot{\hat{W}}_{annsmc} = -\gamma(e_{\dot{\psi}})\, \phi(s)$

where:

$\hat{W}_{annsmc}$: NN weights of the radial basis function (RBF) network
$e_{\dot{\psi}} = \dot{\psi} - \dot{\psi}_{des}$: the yaw rate error
$s = (e_{\dot{\psi}}, \dot{\psi}_{des}, v_{x}, v_{y})$: inputs to the RBFNN
$\phi: s \to \mathbb{R}^{n}$: the output of the hidden layer of the RBFNN (one input layer, one hidden layer, and one output layer), where $n$ is the number of neurons in the hidden layer

The steering angle 210 of the ANNSMC is given as: $\delta_{annsmc} = \hat{W}_{annsmc}\, \phi - k\, e_{\dot{\psi}} - \eta \tanh(e_{\dot{\psi}})$, where $k > 0 \land \eta > 0$
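The ANNSMC steering law and its weight update can be sketched as follows. The sketch assumes that the hidden-layer output $\phi(s)$ has already been computed, that the weights form a plain vector, and that the continuous-time update is integrated with an explicit Euler step of size `dt`. The learning-rate function `gamma_fn` is passed in as a hypothetical callable, for example a parabolic mapping of the yaw rate error.

```python
import math


def annsmc_steering(weights, phi, err, k, eta):
    """delta = weights . phi - k * e - eta * tanh(e), with k > 0 and eta > 0."""
    return sum(w * p for w, p in zip(weights, phi)) - k * err - eta * math.tanh(err)


def annsmc_update(weights, phi, err, dt, gamma_fn):
    """One Euler step of dW/dt = -gamma(e) * phi(s), as stated in the update formula."""
    rate = gamma_fn(err)
    return [w - dt * rate * p for w, p in zip(weights, phi)]


# Illustrative values: two hidden neurons, yaw rate error of 0.1 rad/s.
phi = [1.0, 0.5]
delta = annsmc_steering([0.0, 0.0], phi, err=0.1, k=2.0, eta=0.5)
new_weights = annsmc_update([0.0, 0.0], phi, err=0.1, dt=0.01,
                            gamma_fn=lambda e: 10.0 * e * e)
```

The smooth `tanh` term takes the place of the discontinuous switching term of a classical sliding mode controller, which is a common way to reduce chattering.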

In particular, a feedforward control scheme based on time delay compensation is thus used in conjunction with a dual neural network feedback control, which comprises, for example, the lane-keeping IRL agent and the adaptive, neural-network-based sliding mode control for autonomous steering control.

Advantageously, a control architecture, in particular an overall controller, comprises the kinematic feedforward control, in particular the feedforward controller, and a part with learning-based feedback, in particular the feedback controller. Advantageously, the feedforward controller provides supervisory control, while the learning-based feedback controller minimizes the yaw rate error. Advantageously, the effects of the time delay in the feedforward-assisted dual neural network feedback control architecture are compensated, for example by using a radial basis function to estimate the time delay at a given point in time. Advantageously, the estimated time delay is used in the kinematic feedforward controller to actively select a variable time window for the path curvature ahead of the autonomous vehicle. Advantageously, the kinematic feedforward control, in particular the feedforward controller, calculates the steering angle based on this preview curvature, in particular instead of the actual curvature.

The kinematic feedforward control 180 is determined as follows: $\delta_{ff} = \tan^{-1}(\kappa_{preview} \cdot L)$, where $\kappa_{preview}$ is the preview curvature generated (selected) based on the estimated time delay and $L$ is the wheelbase of the vehicle.
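The kinematic feedforward law translates directly into code. In the sketch below, curvature is assumed to be in 1/m, the wheelbase in m, and the returned steering angle in radians; the function name is hypothetical.

```python
import math


def feedforward_steering(kappa_preview, wheelbase_m):
    """Kinematic feedforward steering angle: atan(kappa_preview * L)."""
    return math.atan(kappa_preview * wheelbase_m)
```

For a straight path (`kappa_preview = 0`) the feedforward angle is zero. For a curve whose radius equals the wheelbase, the product is 1 and the angle is 45 degrees.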

Advantageously, an IRL agent is combined with an adaptive NN-based sliding mode control (ANNSMC) on the feedback path to minimize the yaw rate error for autonomous vehicles. The online learning load is advantageously shared between the two learning methods. Preferably, the IRL agent attempts to generate approximately optimal control policies that minimize the Hamilton-Jacobi-Bellman (HJB) approximation error, in particular in order to reduce the yaw rate error, while the ANNSMC minimizes the yaw rate error directly. Advantageously, the combined approach is robust to various disturbances and perturbations, for example due to the inherent robustness of the ANNSMC. In particular, when the error is large, for example at the beginning of a curve, the agent searches aggressively for policies in the parameter space, while the agent's aggressiveness decreases as the error becomes smaller, for example once the vehicle is aligned with the reference trajectory. Advantageously, this strategy, together with the "dead zone modification", mitigates the problem of control oscillations caused by the high adaptive gain in the feedback path.

The dead zone modification technique is a method for freezing the learning process of the online adaptation mechanism within a specific region of the state space, i.e., when the vehicle enters this region of the state space, the NN weights are not updated and are held constant. This is the case, for example, when the vehicle has approached the desired reference trajectory. In this way, unnecessary updating of the NN weights near the reference trajectory is avoided, so that oscillations do not occur.
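A minimal sketch of the dead zone modification is given below. It assumes that the frozen region is defined by a threshold on the absolute yaw rate error. The text only speaks of "a specific region of the state space", so this threshold and the names `dead_zone` and `weight_rates` are hypothetical.

```python
def dead_zone_update(weights, weight_rates, err, dt, dead_zone):
    """Apply one Euler weight update, or freeze the weights inside the dead zone."""
    if abs(err) <= dead_zone:
        return list(weights)  # inside the dead zone: hold the weights constant
    return [w + dt * r for w, r in zip(weights, weight_rates)]
```

For example, `dead_zone_update([0.2, 0.4], [1.0, -1.0], err=0.005, dt=0.01, dead_zone=0.01)` returns the weights unchanged, whereas an error of `0.05` lets the update through.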

Advantageously, the method is at least partially, in particular completely, a computer-implemented method.

A data processing system, in particular a control architecture for controlling, in particular for steering, a vehicle, in particular for autonomous driving of a vehicle, comprises means for carrying out the method described herein.

A computer program, in particular a computer program product, comprises instructions which, when the program is executed by a computer, cause the computer to carry out the method described herein.

A computer-readable medium comprises instructions which, when executed by a computer, cause the computer to carry out the method described herein.

A processing device can be configured to carry out a method described herein in whole or in part. For this purpose, the processing device can be implemented electronically and comprise a programmable microcomputer or microcontroller, and the method can be in the form of a computer program product with program code means. The computer program product can also be stored on a computer-readable data carrier. Features or advantages of the method can be transferred to the device, or vice versa.

The invention will now be described in more detail with reference to the accompanying figures, in which:

FIG. 1 shows an exemplary schematic representation of a control architecture,
FIG. 2 shows exemplary results of controlling a vehicle using the invention and the prior art,
FIG. 3 shows further exemplary results of controlling a vehicle using the invention and the prior art,
FIG. 4 shows further exemplary results of controlling a vehicle using the invention and the prior art, and
FIG. 5 shows a flowchart of a method with a feedforward controller for controlling a vehicle.

FIG. 1 shows an exemplary schematic representation of a control architecture 100. In the exemplary embodiment, the control architecture 100 is intended to provide a data processing system 105, in particular a closed-loop control system, with which manual driving of a vehicle 110, such as a car, truck, bus, or similar vehicle 110, can be supported and/or replaced. The data processing system 105 can be regarded as an autonomous driving system, where autonomous can be understood as partially autonomous, i.e., only supporting the driver, or fully autonomous, i.e., replacing the driver.

The output of the data processing system 105 can be used to steer the vehicle 110, as explained below for the exemplary embodiment. The control architecture 100 comprises a sensor, in particular a yaw rate sensor, for measuring an actual yaw rate 115 of the vehicle 110. The control architecture 100 comprises a controller 120, in particular an overall controller. The actual yaw rate 115 and a target yaw rate 125, in particular a desired yaw rate, are fed into the controller 120. The target yaw rate 125 is preferably derived from a target path curvature 130, in particular a desired path curvature. "Target" and/or "desired" are to be understood, for example, as meaning that it is desired that this yaw rate and/or path curvature be set for the vehicle 110.

It may be expedient for the target yaw rate 125 and/or the target path curvature 130 to be calculated in the controller 120, for example from input variables such as a driving speed 135 of the vehicle 110 and/or a desired path curvature 130 to be driven. The controller 120 calculates a yaw rate error 140 by determining the difference between the actual yaw rate 115 and the target yaw rate 125. The controller 120 outputs a total manipulated variable 145, which is used to control, in particular to steer, the vehicle 110. In the exemplary embodiment, the total manipulated variable 145 is composed, in particular additively, of three variables, namely a first manipulated variable 150, in particular the manipulated variable of a feedforward controller 165, a second manipulated variable 155, and a third manipulated variable 160. The second manipulated variable 155 and the third manipulated variable 160, in particular combined additively, correspond in particular to the manipulated variable of a feedback controller 166. "Additive" can be understood to mean addition and/or subtraction. It is also conceivable that the total manipulated variable 145 is formed mathematically from the three manipulated variables 150, 155, 160, for example via weightings, functions, and/or the like. In a further exemplary embodiment, the total manipulated variable 145 can be composed of exactly three or of more than three variables.
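The signal flow just described can be sketched in a few lines. The sketch assumes the common kinematic relation "yaw rate = driving speed times path curvature" for deriving the target yaw rate 125, a sign convention of "actual minus target" for the yaw rate error 140, and a purely additive combination without weightings; the application fixes none of these details, and all function names are hypothetical.

```python
def target_yaw_rate(speed, curvature):
    """Assumed kinematic relation: yaw rate [rad/s] = speed [m/s] * curvature [1/m]."""
    return speed * curvature


def yaw_rate_error(actual, target):
    """Difference between actual yaw rate 115 and target yaw rate 125.

    The sign convention (actual minus target) is an assumption.
    """
    return actual - target


def total_manipulated_variable(u_feedforward, u_feedback_1, u_feedback_2):
    """Additive combination of the manipulated variables 150, 155, and 160."""
    return u_feedforward + u_feedback_1 + u_feedback_2


target = target_yaw_rate(speed=20.0, curvature=0.01)      # about 0.2 rad/s
error = yaw_rate_error(actual=0.18, target=target)        # about -0.02 rad/s
u_total = total_manipulated_variable(0.05, 0.01, -0.002)  # about 0.058
```

A weighted or otherwise functional combination, which the description also allows, would replace only the last function.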

The controller 120 comprises the feedforward controller 165 and the feedback controller 166. The feedback controller 166 comprises a first backward controller 170, also called the first sub-controller 170, and a second backward controller 175, also called the second sub-controller 175. The feedforward controller 165 outputs the first manipulated variable 150. The first backward controller 170 outputs the second manipulated variable 155. The second backward controller 175 outputs the third manipulated variable 160.

In the exemplary embodiment, the feedforward controller 165 is designed as a feedforward control 180, in particular a kinematic feedforward control. The feedforward controller 165 is intended to calculate the first manipulated variable 150 such that a desired path curvature can be realized. In the exemplary embodiment, the desired path curvature corresponds to the path curvature 130 to be driven, in particular the target path curvature 130. In the exemplary embodiment, the second and third manipulated variables 155, 160 are used to correct the first manipulated variable 150 such that a total manipulated variable 145 is generated which causes the vehicle 110 to follow the desired driving trajectory, in particular the path curvature, precisely, quickly, and comfortably.

Input variables for the feedforward controller 165 are, for example, a set of path curvatures 220, the desired and/or current yaw rate 225, and the driving speed 230, in particular the lateral driving speed. The input variables for the feedforward controller 165 and the feedback controller 166 can be the same, in particular the driving speed 135, 230 and/or the yaw rate 225, 115.

A set of path curvatures 220 is to be understood, for example, as meaning that the set contains several path curvatures. Depending on boundary conditions such as driving speed 230, yaw rate 225, or the like, all path curvatures in the set are suitable for driving the desired driving trajectory. Depending on the driving speed 230, yaw rate 225, or the like, only one path curvature from the set of path curvatures, in particular a preview path curvature 235, can be selected for precise, fast, and comfortable driving. It should be emphasized that the preview path curvature 235 can also be selected only for a section, so that, for example, several preview path curvatures arranged one after the other ultimately allow the driving trajectory to be driven. Such a sequence can arise, for example, when the driving speed 230 and/or the yaw rate 225 of the vehicle 110 changes while driving through the curve.

The preview path curvature 235 is selected from the set of path curvatures 220 in particular also on the basis of a control-based time delay. The control-based time delay is, for example, the time required by the controller 120 as a whole, and/or the feedforward controller 165, and/or the feedback controller 166, and/or the first backward controller 170, and/or the second backward controller 175 to calculate and/or output the respective manipulated variable 145, 150, 155, 160. A control-based time delay can occur in particular when machine learning is used in one or more of the controllers 120, 165, 166, 170, 175.

The feedforward controller 165 comprises a machine learning algorithm, in particular a neural network. For this purpose, the feedforward controller 165 comprises an update law 240, which is fed in particular with the driving speed 230 and/or the desired and current yaw rate 225. The feedforward controller 165 comprises a radial basis function 245, RBF for short. The radial basis function 245 is fed by the update law 240. The radial basis function 245 estimates the control-based time delay. Using this estimate of the control-based time delay, the preview path curvature 235 is selected from the set of path curvatures 220. The preview path curvature 235 is read into the feedforward control 180, where it serves as the input variable for calculating the first manipulated variable 150.
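One conceivable way to turn an estimated time delay into a preview path curvature is sketched below. The sketch assumes that the set of path curvatures 220 is sampled at equal arc-length spacing along the planned path, that the preview distance is the product of driving speed and estimated delay, and that the kinematic feedforward control follows a single-track (bicycle) model with steering angle atan(wheelbase * curvature). None of these choices is stated in the application; spacing, wheelbase, and both function names are hypothetical.

```python
import math


def select_preview_curvature(curvatures, spacing, speed, delay):
    """Pick the curvature located speed * delay metres ahead on the path.

    Assumption: curvatures[i] belongs to the path point i * spacing metres ahead.
    """
    preview_distance = speed * delay
    index = round(preview_distance / spacing)
    # Clamp to the available preview horizon.
    index = max(0, min(index, len(curvatures) - 1))
    return curvatures[index]


def kinematic_steering_angle(curvature, wheelbase):
    """Kinematic single-track feedforward (assumed model): delta = atan(L * kappa)."""
    return math.atan(wheelbase * curvature)


curvatures = [0.000, 0.002, 0.004, 0.006, 0.008]  # 1/m, sampled every 2 m
preview = select_preview_curvature(curvatures, spacing=2.0, speed=20.0, delay=0.23)
steering = kinematic_steering_angle(preview, wheelbase=2.8)
```

In this example the preview distance is 4.6 m, so the sample at index 2 is chosen. A longer estimated delay or a higher driving speed moves the selection further along the path, up to the end of the horizon.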

In the exemplary embodiment, the first backward controller 170 generates the second manipulated variable 155 using a machine learning algorithm. The machine learning algorithm comprises an online algorithm. This can be understood, for example, to mean that the online algorithm does not need to be trained beforehand, or not completely. Changing influencing variables, such as vehicle weight, friction between wheels and road surface, and the like, can be taken into account during operation, in particular without prior training. The machine learning algorithm is preferably stable; in particular, it comprises a stable update law. "Stable" can be understood, for example, to mean that the yaw rate error and/or a controlled variable and/or a control variable is bounded.

In the exemplary embodiment, the first backward controller 170 comprises a neural network 185, in particular a critic network, an update law 190, in particular an online update law, optionally a gradient operator 195, and an IRL agent 200. "IRL" stands for "integral reinforcement learning". The neural network 185 comprises in particular a radial basis function, also called RBF. In the exemplary embodiment, the update law 190 feeds the neural network 185. The output of the neural network 185, in particular the control variable to be processed further, is optionally revised and/or corrected by the gradient operator 195, for example using the time derivative of the driving speed. The output of the gradient operator 195 flows into the IRL agent 200, which finally calculates and outputs the second manipulated variable 155. The IRL agent 200 uses a variable learning rate, which is in particular efficient and fast; in the exemplary embodiment, the variable learning rate can, for example, be a function of the driving speed 135 and/or the yaw rate error 140. However, learning can also be based on the yaw rate error 140 and/or its first time derivative, the target yaw rate 125 and/or its first time derivative, and/or the driving speed 135.
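A variable learning rate of this kind could, for instance, look like the following sketch. The specific functional form is an assumption made only for illustration: the rate grows with the magnitude of the yaw rate error, is reduced above a reference speed, and is capped. The application only states that the rate can be a function of driving speed and/or yaw rate error; the name variable_learning_rate and all parameters are hypothetical.

```python
def variable_learning_rate(speed, yaw_rate_error, base_rate=0.1,
                           error_gain=5.0, speed_ref=10.0, max_rate=1.0):
    """Illustrative learning rate depending on speed and yaw rate error.

    Larger errors make learning more aggressive; speeds above speed_ref
    scale the rate down; the result never exceeds max_rate.
    """
    error_factor = 1.0 + error_gain * abs(yaw_rate_error)
    speed_factor = speed_ref / max(speed, speed_ref)
    return min(max_rate, base_rate * error_factor * speed_factor)


rate_large_error = variable_learning_rate(speed=10.0, yaw_rate_error=0.5)
rate_small_error = variable_learning_rate(speed=10.0, yaw_rate_error=0.05)
rate_high_speed = variable_learning_rate(speed=30.0, yaw_rate_error=0.1)
```

The rate for the large error exceeds the rate for the small error, which matches the described behavior of an agent that searches aggressively at the beginning of a curve and more cautiously once the vehicle is aligned with the reference.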

Input variables for the first backward controller 170 are in particular the driving speed 135, the yaw rate error 140, and/or the actual yaw rate 115.

In the exemplary embodiment, the second backward controller 175 generates the third manipulated variable 160 using a machine learning algorithm. The machine learning algorithm comprises an online algorithm 205. This can be understood, for example, to mean that the online algorithm 205 does not need to be trained beforehand, or not completely. Changing influencing variables, such as vehicle weight, friction between wheels and road surface, and the like, can be taken into account during operation, in particular without prior training. The machine learning algorithm is preferably stable; in particular, it comprises a stable update law. "Stable" can be understood, for example, to mean that the yaw rate error and/or a controlled variable and/or a control variable is bounded.

In the exemplary embodiment, the second backward controller 175 comprises a neural network 210, in particular an adaptive, in particular directly adaptive, neural network 210. In the exemplary embodiment, the online algorithm 205 feeds the neural network 210. It may be expedient for the online algorithm 205 to feed a radial basis function network 215, which outputs a control variable to the neural network 210. It may also be expedient for the online algorithm 205 to feed the neural network 210 directly. The output of the neural network 210 is the third manipulated variable 160. The neural network 210 uses a variable learning rate, which is in particular efficient and fast; in the exemplary embodiment, the variable learning rate can, for example, be a function of the driving speed 135 and the yaw rate error 140. However, learning can also be based on the yaw rate error 140 and/or its first time derivative, the target yaw rate 125 and/or its first time derivative, and/or the driving speed 135.
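For readers unfamiliar with radial basis function networks, the following sketch shows a forward pass of such a network. It assumes Gaussian basis functions with a common width and a linear output layer, which is a widespread choice but is not specified in the application; the input vector, centers, width, and weights are hypothetical example values.

```python
import math


def rbf_features(x, centers, width):
    """Gaussian radial basis functions (assumed form) evaluated at input x."""
    features = []
    for center in centers:
        squared_distance = sum((xi - ci) ** 2 for xi, ci in zip(x, center))
        features.append(math.exp(-squared_distance / (2.0 * width ** 2)))
    return features


def rbf_output(x, centers, width, weights):
    """Linear combination of the basis functions with adaptable weights."""
    phi = rbf_features(x, centers, width)
    return sum(w * p for w, p in zip(weights, phi))


# Hypothetical input: [normalized driving speed, yaw rate error]
centers = [[0.0, 0.0], [1.0, 1.0]]
phi = rbf_features([0.0, 0.0], centers, width=1.0)
u = rbf_output([0.0, 0.0], centers, width=1.0, weights=[2.0, 0.0])
```

An online algorithm would adapt only the entries of weights during driving, while centers and width stay fixed in this simple form.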

Input variables for the second backward controller 175 are in particular the driving speed 135, the yaw rate error 140, and/or the actual yaw rate 115.

FIG. 1 shows the data processing system 105, in particular a control architecture 100 for controlling, in particular for steering, a vehicle 110, in particular for autonomous driving of a vehicle 110, with means for carrying out the method described herein. The exemplary embodiment comprises a computer program (not shown in the figures), in particular a computer program product (not shown in the figures), with instructions which, when the program is executed by a computer (not shown in the figures), cause the computer to carry out the method described herein. The exemplary embodiment comprises a computer-readable medium (not shown in the figures) with instructions which, when executed by the computer, cause the computer to carry out the method described herein.

FIG. 2 shows the result of the control using a method described herein for a first exemplary cornering maneuver. The top left of FIG. 2 shows a lateral deviation in a unit of length plotted over the driving time, the top right of FIG. 2 shows a heading error angle in degrees plotted over the driving time, the bottom left of FIG. 2 shows the driven position in a unit of length, plotted in the X and Y directions, and the bottom right of FIG. 2 shows the current wheel angle in degrees plotted over the driving time. FIG. 2 shows the method 300 described herein as well as two methods 305, 310 according to the prior art. It can be clearly seen from FIG. 2 that the method 300 described herein enables faster and more precise control with greater comfort than the two prior-art methods 305, 310.

FIG. 3 shows the result of the control using a method described herein for a second exemplary cornering maneuver. The top left of FIG. 3 shows a lateral deviation in a unit of length plotted over the driving time, the top right of FIG. 3 shows a heading error angle in degrees plotted over the driving time, the bottom left of FIG. 3 shows the driven position in a unit of length, plotted in the X and Y directions, and the bottom right of FIG. 3 shows the current wheel angle in degrees plotted over the driving time. FIG. 3 shows the method 400 described herein as well as two methods 405, 410 according to the prior art. It can be clearly seen from FIG. 3 that the method 400 described herein enables faster and more precise control with greater comfort than the two prior-art methods 405, 410.

FIG. 4 shows the result of the control using a method described herein for a third exemplary cornering maneuver. The top left of FIG. 4 shows a lateral deviation in a unit of length plotted over the driving time, the top right of FIG. 4 shows a heading error angle in degrees plotted over the driving time, the bottom left of FIG. 4 shows the driven position in a unit of length, plotted in the X and Y directions, and the bottom right of FIG. 4 shows the current wheel angle in degrees plotted over the driving time. FIG. 4 shows the method 500 described herein as well as two methods 505, 510 according to the prior art. It can be clearly seen from FIG. 4 that the method 500 described herein enables faster and more precise control with greater comfort than the two prior-art methods 505, 510.

FIG. 5 shows, by way of example, the method 600 with a feedforward controller 165 for controlling, in particular steering, a vehicle 110, in particular for autonomous driving, wherein the method 600 comprises the following steps: input 605 of a set of path curvatures 220, each describing a path to be driven by the vehicle 110, into the feedforward controller 165; estimation 610 of a control-based time delay by the feedforward controller 165; selection 615 of a preview path curvature 235 from the set of path curvatures 220 based on the estimated control-based time delay and a driving parameter, for example the driving speed 230 and/or the yaw rate 225 of the vehicle 110; derivation 620 of a control variable of the feedforward controller 165 from the preview path curvature 235; and output 625 of the control variable of the feedforward controller 165, wherein the control variable of the feedforward controller 165 is used to control the vehicle 110.
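The order of steps 605 to 625 can be summarized as a small skeleton in which the estimator, the selection rule, and the derivation of the control variable are passed in as functions. This is only a structural sketch: run_feedforward_step and the three placeholder functions in the usage example are hypothetical stand-ins and do not reproduce the learning-based estimator of the application.

```python
def run_feedforward_step(curvatures, speed, estimate_delay,
                         select_preview, derive_control):
    """One pass through steps 605 to 625 with pluggable building blocks."""
    # 605: the set of path curvatures enters the feedforward controller.
    # 610: estimate the control-based time delay.
    delay = estimate_delay(speed)
    # 615: select the preview path curvature from delay and driving parameter.
    preview = select_preview(curvatures, delay, speed)
    # 620: derive the control variable from the preview path curvature.
    control = derive_control(preview)
    # 625: output the control variable.
    return control


# Placeholder building blocks, chosen only to make the skeleton executable.
curvature_set = [0.000, 0.001, 0.002, 0.003, 0.004, 0.005]
control = run_feedforward_step(
    curvature_set,
    speed=10.0,
    estimate_delay=lambda v: 0.5,
    select_preview=lambda ks, d, v: ks[min(int(v * d), len(ks) - 1)],
    derive_control=lambda kappa: 2.8 * kappa,
)
```

Keeping the three building blocks separate mirrors the description, in which the delay estimate, the selection, and the feedforward control 180 are distinct elements.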

Reference symbols

100: Control architecture
105: Data processing system
110: Vehicle
115: Actual yaw rate
120: Controller
125: Target yaw rate
130: Target path curvature
135: Driving speed
140: Yaw rate error
145: Total control variable
150: First control variable
155: Second control variable
160: Third control variable
165: Feedforward controller
166: Feedback controller
170: First feedback controller
175: Second feedback controller
180: Feedforward control
185: Neural network
190: Update law
195: Gradient operator
200: IRL agent
205: Online algorithm
210: Neural network
215: Radial basis function network
220: Set of path curvatures
225: Yaw rate
230: Driving speed
235: Preview path curvature
240: Update law
245: Radial basis function
300: Method described herein
305: Prior art method
310: Prior art method
400: Method described herein
405: Prior art method
410: Prior art method
500: Method described herein
505: Prior art method
510: Prior art method
600: Method
605: Input
610: Estimation
615: Selection
620: Derivation
625: Output

Claims

Method (600) with a feedforward controller (165) for controlling, in particular steering, a vehicle (110), in particular for autonomous driving, wherein the method (600) comprises the following steps: input (605) of a set of path curvatures (220), each describing a path to be traveled by the vehicle (110), into the feedforward controller (165); estimation (610) of a control-based time delay by the feedforward controller (165); selection (615) of a preview path curvature (235) from the set of path curvatures (220) based on the estimated control-based time delay and a driving parameter of the vehicle (110); derivation (620) of a control variable of the feedforward controller (165) from the preview path curvature (235); and output (625) of the control variable of the feedforward controller (165), wherein the control variable of the feedforward controller (165) is used to control the vehicle (110).

Method (600) according to claim 1, wherein the control-based time delay is estimated using a machine learning algorithm (240, 245), in particular a neural network, and in particular wherein the driving parameter (230, 225) is used as an input variable for the machine learning algorithm (240, 245), in particular the neural network.

Method (600) according to claim 2, wherein the machine learning algorithm (240, 245), in particular the neural network, comprises a radial basis function (245).

Method (600) according to claim 2 or 3, wherein the machine learning algorithm (240, 245), in particular the neural network, comprises an update law (240).

Method (600) according to one of the preceding claims, comprising a feedback controller (166), wherein an actual yaw rate (115) of the vehicle (110) is measured, in particular with a yaw rate sensor; wherein a yaw rate error (140) is determined from the difference between the actual yaw rate (115) and a target yaw rate (125); wherein the yaw rate error (140) is processed in the feedback controller (166); wherein the feedback controller (166) outputs a control variable of the feedback controller (166), which, in particular together with the control variable of the feedforward controller (165), is used for controlling, in particular for steering, the vehicle (110); and wherein the feedforward controller (165) at least partially, in particular completely, takes into account and/or estimates the control time of the feedback controller (166) when estimating the control-based time delay.

Method (600) according to claim 5, wherein the control variable of the feedback controller (166) is generated using a machine learning algorithm.

Method (600) according to claim 6, wherein the machine learning algorithm comprises an online learning algorithm.

Method (600) according to claim 6 or 7, wherein the machine learning algorithm comprises a variable learning rate.

Method (600) according to one of claims 5 to 8, wherein the feedback controller (166) comprises a first feedback controller (170) and a second feedback controller (175).

Method (600) according to claim 9, wherein the first feedback controller (170) comprises a neural network (185), in particular wherein the first feedback controller (170) comprises an integral reinforcement learning (IRL) algorithm (200), in particular wherein the yaw rate error (140) is an input variable for the integral reinforcement learning algorithm (200).

Method (600) according to claim 9 or 10, wherein the second feedback controller (175) comprises a direct adaptive neural network (210).

Method (600) according to one of claims 1 to 11, wherein the method (600) is at least partially, in particular completely, a computer-implemented method.

Data processing system, in particular a control architecture (100) for controlling, in particular for steering, a vehicle (110), in particular for autonomous driving of a vehicle (110), with means for carrying out the method (600) according to one of claims 1 to 12.

Computer program, in particular computer program product, with instructions which, when the program is executed by a computer, cause the computer to carry out the method (600) according to one of claims 1 to 12.

Computer-readable medium having instructions which, when executed by a computer, cause the computer to carry out the method (600) according to one of claims 1 to 12.