Metrics

Metrics for Evaluating Vehicle Performance

We motivate metrics for evaluating vehicle performance by first defining a notion of risk. Let \(P_0\) denote the base distribution that models standard traffic behavior, and let \(X \sim P_0\) be a realization of the simulation (e.g. weather conditions and driving policies of other agents). For a continuous objective function \(f: \mathcal{X} \to \mathbb{R}\) that measures "safety", so that low values of \(f(x)\) correspond to dangerous scenarios, our goal is to evaluate the probability of a dangerous event

\[\begin{equation} p_{\gamma} := \mathbb{P}_{0}(f(X) < \gamma), \end{equation}\]

for some threshold \(\gamma\). Examples of the objective function \(f\) include simple performance measures such as the (negative) maximum magnitude of acceleration or jerk during a simulation. More sophisticated measures include the minimum distance to any other object or the minimum time-to-collision (TTC) during a simulation. The objective can also target specific aspects of system performance. For example, to stress-test a perception module, the objective can include the (negative) maximum norm of the off-diagonal elements of the perception system's confusion matrix.
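As a minimal sketch of what evaluating \(p_{\gamma}\) means in practice, the probability can be approximated by naive Monte Carlo: draw rollouts from \(P_0\), evaluate \(f\), and count how often the value falls below \(\gamma\). The names `estimate_risk`, `toy_simulate`, and `toy_objective` are hypothetical stand-ins, and the Gaussian "simulator" is only a toy assumption used to make the example runnable.

```python
import random


def estimate_risk(simulate, objective, gamma, n_samples, seed=0):
    """Naive Monte Carlo estimate of P_0(f(X) < gamma).

    simulate(rng) draws one rollout X from the base distribution P_0;
    objective(x) returns the safety score f(x), where low means dangerous.
    """
    rng = random.Random(seed)
    dangerous = 0
    for _ in range(n_samples):
        x = simulate(rng)
        if objective(x) < gamma:
            dangerous += 1
    return dangerous / n_samples


# Toy stand-in (assumption): each "rollout" is summarized directly by its
# minimum TTC in seconds, drawn from a normal distribution.
def toy_simulate(rng):
    return rng.gauss(3.0, 1.0)


def toy_objective(x):
    return x


p_hat = estimate_risk(toy_simulate, toy_objective, gamma=1.0, n_samples=100_000)
print(f"estimated p_gamma = {p_hat:.4f}")
```

When dangerous events are rare, this plain estimator needs a very large number of rollouts to see even a few of them, so it is best read as a definition of the quantity rather than a recommended estimator.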

Time-to-collision

Let \(T_i(t)\) be the instantaneous time-to-collision between the ego vehicle and the \(i\)-th environment vehicle at time step \(t\). For a given simulation rollout \(X\), the TTC metric is given by

\[\begin{equation} f(X):=\min_{t} \left ( \min_{i}T_i(t) \right ). \end{equation}\]

The value \(T_i(t)\) can be defined in multiple ways (see e.g. https://ieeexplore.ieee.org/document/8500709). We define it as the amount of time that would elapse before the two vehicles' bounding boxes intersect, assuming that both vehicles keep the velocities they have in the snapshot at time \(t\). Time-to-collision directly captures whether or not the ego vehicle was involved in a crash: if it is positive, no crash occurred, and if it is 0 or negative, there was a collision.
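The definition above can be sketched as follows, under the simplifying assumption that bounding boxes are axis-aligned rectangles (real vehicle boxes are usually oriented). The `Box` type and function names are hypothetical. This sketch returns 0 for boxes that already overlap and infinity for boxes that never meet; it does not produce negative values.

```python
import math
from dataclasses import dataclass


@dataclass
class Box:
    cx: float  # center x (m)
    cy: float  # center y (m)
    hx: float  # half-length along x (m)
    hy: float  # half-width along y (m)
    vx: float  # velocity x (m/s)
    vy: float  # velocity y (m/s)


def time_to_collision(ego, other):
    """Time until two axis-aligned boxes overlap at constant velocity."""
    lo, hi = -math.inf, math.inf
    axes = (
        (other.cx - ego.cx, other.vx - ego.vx, ego.hx + other.hx),
        (other.cy - ego.cy, other.vy - ego.vy, ego.hy + other.hy),
    )
    for d, v, reach in axes:
        if v == 0.0:
            if abs(d) > reach:
                return math.inf  # separated on this axis forever
            continue  # overlapping on this axis at all times
        # Overlap on this axis holds while |d + v * t| <= reach.
        t_a, t_b = (-reach - d) / v, (reach - d) / v
        lo = max(lo, min(t_a, t_b))
        hi = min(hi, max(t_a, t_b))
    if lo > hi or hi < 0.0:
        return math.inf  # never overlap, or overlap only in the past
    return max(lo, 0.0)


def ttc_metric(rollout):
    """f(X): minimum over time steps and environment vehicles.

    rollout is a list of (ego, [other, ...]) snapshots, one per time step.
    """
    return min(
        (time_to_collision(ego, o) for ego, others in rollout for o in others),
        default=math.inf,
    )


ego = Box(0.0, 0.0, 2.0, 1.0, 10.0, 0.0)
lead = Box(24.0, 0.0, 2.0, 1.0, 0.0, 0.0)      # stopped, same lane, 20 m gap
adjacent = Box(24.0, 3.5, 2.0, 1.0, 0.0, 0.0)  # stopped, next lane
print(time_to_collision(ego, lead))      # 2.0
print(time_to_collision(ego, adjacent))  # inf
```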

Incorporating blame

The goal of this section is to describe different ways to measure the performance of an autonomous vehicle within a scenario. Particular weight is given to mechanisms that (1) encode "fault" or (2) map binary measures to a continuous metric. We discuss the use of "responsibility-sensitive safety" (RSS) and similar fault-based systems, how the hyper-parameters of such methods are defined, and the ways in which current methodology is unsatisfactory. A key question is whether "blame" is applied post-hoc to test results or used as a search metric itself.

As anyone who has ever received a ticket for speeding or rolling through a stop sign can attest, it is possible to violate regional traffic laws and conventions without crashing. In particular, rules about stopping or turning from the proper lane are more naturally expressed as true or false than typical highway regulations about maintaining a safe speed of travel. The Vienna Traffic Convention is one source of such rules, although it is written in natural language and is not precise enough to be formally evaluated. Several researchers have proposed encoding the rules in higher-order logic, temporal logic, and other formalisms (e.g. https://ieeexplore.ieee.org/document/7313361). The logical encoding itself is not our focus here; what matters is how the simulation behaves when such rules are violated.

Traffic laws in urban driving can be handled in multiple ways. A violation of traffic rules could be treated as an exit condition for the simulator, as a component of the objective function returned by the simulator to the TrustworthySearch API, or as a means of filtering executions after a set of simulation runs has been completed.
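The three options can be illustrated with a small sketch. Everything here is hypothetical: `StepResult`, `run_rollout`, `filter_rollouts`, the `mode` strings, and the fixed `penalty` are illustration-only names and choices, and none of them reflect the actual simulator interface or the TrustworthySearch API.

```python
import math
from dataclasses import dataclass


@dataclass
class StepResult:
    ttc: float            # instantaneous minimum TTC at this step (s)
    violated_rule: bool   # output of some rule checker (assumed given)


def run_rollout(steps, mode, penalty=1.0):
    """Return (objective, violated) for one rollout.

    mode == "exit":      stop simulating at the first violation
    mode == "objective": subtract a penalty from the objective on violation
    mode == "filter":    only record the violation for later filtering
    """
    min_ttc = math.inf
    violated = False
    for step in steps:
        min_ttc = min(min_ttc, step.ttc)
        if step.violated_rule:
            violated = True
            if mode == "exit":
                break
    objective = min_ttc
    if mode == "objective" and violated:
        objective -= penalty  # lower objective means "more dangerous"
    return objective, violated


def filter_rollouts(results, keep_violations=False):
    """Post-hoc filter over (objective, violated) pairs."""
    return [obj for obj, violated in results if violated == keep_violations]


steps = [StepResult(3.0, False), StepResult(2.0, True), StepResult(0.5, False)]
print(run_rollout(steps, "exit"))       # (2.0, True)
print(run_rollout(steps, "objective"))  # (-0.5, True)
print(run_rollout(steps, "filter"))     # (0.5, True)
```

The choice matters for search: an exit condition hides whatever would have happened after the violation, while a penalty term lets the search distinguish rollouts that violate rules from those that do not.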

Responsibility in the event of an accident

Beyond simple moving violations, there is still more to say about crashes (instances where \(\text{TTC}\leq 0\)). When a crash occurs between two or more human-driven vehicles, law enforcement generally assesses which drivers are responsible for causing the accident. While laws vary between jurisdictions, the majority of them can be derived from a few invariant "common-sense" rules (https://arxiv.org/abs/1708.06374):

Do not hit someone from behind
Do not cut-in recklessly
Right-of-way is given not taken
Use caution in areas of limited visibility
If you can avoid an accident without causing another one, you must do it

The arguments presented in the RSS framework of https://arxiv.org/abs/1708.06374 (and the nearly identical presentation of https://www.nvidia.com/content/dam/en-zz/Solutions/self-driving-cars/safety-force-field/an-introduction-to-the-safety-force-field-v2.pdf) attempt to formalize the common-sense paradigm above. However, the ideas are hardly unique to the RSS framework (see the 1938 article https://www.jstor.org/stable/1416145). In particular, RSS and its variants describe notions of a safe-following distance as well as "proper" responses to dangerous situations. RSS can be useful for filtering or guiding test case generation.
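As an illustration of a safe following distance, the sketch below implements the longitudinal minimum distance as formulated in the RSS paper cited above: the rear vehicle is assumed to accelerate at most `a_max_accel` during its response time and then brake at least `b_min_brake`, while the front vehicle brakes at most `b_max_brake`. The function name and the parameter values in the example are assumptions chosen for illustration, not recommended settings.

```python
def rss_min_following_distance(
    v_rear, v_front, response_time, a_max_accel, b_min_brake, b_max_brake
):
    """Minimum safe longitudinal gap (m) between a rear and a front vehicle.

    v_rear, v_front: longitudinal speeds (m/s)
    response_time:   rear vehicle's response time (s)
    a_max_accel:     max acceleration of the rear vehicle during response (m/s^2)
    b_min_brake:     minimum braking the rear vehicle applies afterwards (m/s^2)
    b_max_brake:     maximum braking the front vehicle may apply (m/s^2)
    """
    # Speed of the rear vehicle at the end of its response time (worst case).
    v_after = v_rear + response_time * a_max_accel
    distance = (
        v_rear * response_time
        + 0.5 * a_max_accel * response_time ** 2
        + v_after ** 2 / (2.0 * b_min_brake)
        - v_front ** 2 / (2.0 * b_max_brake)
    )
    return max(distance, 0.0)


# Illustrative values only (assumptions).
gap = rss_min_following_distance(
    v_rear=20.0, v_front=20.0, response_time=1.0,
    a_max_accel=2.0, b_min_brake=4.0, b_max_brake=8.0,
)
print(gap)  # 56.5
```

The values chosen for the response time and the acceleration and braking bounds are exactly the kind of hyper-parameters whose definition affects which situations are labeled dangerous.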