Why Overfitting Data is a Handicap for ADAS Deployment

ADAS-enabled vehicles are most commonly driven on highways and main roads, for several reasons. One is that these roads tend to present less complex driving scenarios than residential areas or side roads. Highways and main roads often have fewer intersections and more predictable traffic patterns, which makes it easier for the vehicle to navigate and make decisions.

Another reason is that operating on highways and main roads can be more efficient and cost-effective. These roads are typically better maintained and have fewer obstacles, which can reduce the need for the sensors and other hardware required to navigate more challenging environments. In addition, vehicles can take advantage of dedicated lanes and other infrastructure that may not be available on side roads or in neighborhoods.

Finally, operating vehicles autonomously under human supervision on highways and main roads may be seen as safer and more controlled, because these roads are typically less crowded, have fewer pedestrians and other potential hazards, and have HD map coverage. This can help minimize the risk of accidents or other incidents and make it easier to test and evaluate the performance of the autonomous vehicle.

However, concentrating data collection on these roads can cause autonomous driving machine learning models to overfit to them, so the models may not generalize well to less common environments such as side roads and neighborhoods. This causes problems when the vehicle encounters edge cases it has not seen before, such as pedestrians, bicycles, or animals, which may be more common in residential areas. When that happens, the system does not know what to do and, in the best case, will disengage. In fact, the roads that supply most of the data tend to be the roads where edge cases are least likely to occur.
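One way to make this gap visible is to score a model separately for each road type instead of reporting a single aggregate number. The sketch below is a minimal illustration, not a production evaluation harness: the `accuracy_by_road_type` helper, the road-type labels, and the toy records are all hypothetical and exist only to show the idea.

```python
from collections import defaultdict


def accuracy_by_road_type(records):
    """Return {road_type: accuracy} for (road_type, is_correct) pairs."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for road_type, is_correct in records:
        totals[road_type] += 1
        hits[road_type] += int(is_correct)
    return {road: hits[road] / totals[road] for road in totals}


# Toy, made-up records for illustration only:
# (road type, whether the model's prediction was correct).
records = (
    [("highway", True)] * 95
    + [("highway", False)] * 5
    + [("residential", True)] * 6
    + [("residential", False)] * 4
)

per_road = accuracy_by_road_type(records)
overall = sum(is_correct for _, is_correct in records) / len(records)

print(per_road)
print(round(overall, 3))
```

In this toy data the aggregate score looks healthy because highway samples dominate, while the residential slice is much weaker. That is the pattern a single overall metric can hide.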

To address this issue, it is important to collect and use a diverse set of data that represents the full range of operating conditions the vehicle is expected to encounter, including a variety of roads, weather conditions, and traffic situations. With a diverse and representative dataset, it is possible to train a model that handles a wide range of driving scenarios and performs well across environments. Having outlined why most data comes from highways and main roads, we now lay out new approaches to addressing this problem.
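Before turning to those approaches, it may help to see one common way of working with an imbalanced dataset in the meantime: drawing training samples with weights inversely proportional to how often each road type appears. This is a minimal sketch under stated assumptions. The `balanced_sample_weights` helper, the road-type labels, and the 80/15/5 split are hypothetical, and reweighting does not replace collecting genuinely new data from underrepresented roads.

```python
import random
from collections import Counter


def balanced_sample_weights(road_types):
    """Give each sample a weight inversely proportional to its road type's frequency."""
    counts = Counter(road_types)
    n_types = len(counts)
    total = len(road_types)
    # Each road type ends up with the same total weight: total / n_types.
    return [total / (n_types * counts[road]) for road in road_types]


# Hypothetical drive log dominated by highway data.
road_types = ["highway"] * 80 + ["main_road"] * 15 + ["residential"] * 5
weights = balanced_sample_weights(road_types)

# Draw a training batch with replacement using the balancing weights.
rng = random.Random(0)
batch = rng.choices(road_types, weights=weights, k=3000)

print(Counter(road_types))  # raw, imbalanced counts
print(Counter(batch))       # roughly even counts per road type
```

Because the rare road types are sampled with replacement, the same few residential frames are repeated many times. The batch is balanced, but the model still sees only as much residential variety as was originally collected.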

Integrating maps and localization into the model training loop can expand coverage by providing additional context about the environment in which the model is operating. Localization is the process of determining the position and orientation of an object or device in its environment. By incorporating localization into the training loop, it is possible to collect data beyond the borders of a pre-existing map, giving the system additional information about the location and orientation of objects and events in the environment. This helps the model better understand the context of new scenarios and begin training on data it could not previously learn from.

