-
DeepInteraction 3D Object Detection via Modality Interaction
Existing top-performing 3D object detectors typically rely on a multi-modal fusion strategy. This design, however, is fundamentally restricted: it overlooks modality-specific useful information, ultimately hampering model performance. To address this limitation, we introduce a novel modality interaction strategy in which individual per-modality representations are learned and maintained throughout, so that their unique characteristics can be exploited during object detection. To realize this strategy, we design a DeepInteraction architecture characterized by a multi-modal representational interaction encoder and a multi-modal predictive interaction decoder.
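The core idea lends itself to a short sketch: two streams exchange information through cross-attention while each keeps its own representation. This is a minimal illustration of the dual-stream principle, not the paper's exact encoder; the module name, token shapes, and dimensions below are assumptions.

import torch
import torch.nn as nn

class InteractionBlock(nn.Module):
    def __init__(self, dim=128, heads=4):
        super().__init__()
        self.lidar_from_img = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.img_from_lidar = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, lidar_tokens, img_tokens):
        # Cross-modal attention in both directions; residual connections
        # maintain the per-modality representations throughout.
        l, _ = self.lidar_from_img(lidar_tokens, img_tokens, img_tokens)
        i, _ = self.img_from_lidar(img_tokens, lidar_tokens, lidar_tokens)
        return lidar_tokens + l, img_tokens + i

lidar = torch.randn(2, 200, 128)  # (batch, BEV tokens, channels) -- assumed shapes
image = torch.randn(2, 300, 128)  # (batch, image tokens, channels)
lidar_out, image_out = InteractionBlock()(lidar, image)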
-
BEVFusion A Simple and Robust LiDAR-Camera Fusion Framework
Fusing camera and LiDAR information has become a de-facto standard for 3D object detection tasks. Current methods rely on point clouds from the LiDAR sensor as queries to leverage features from the image space. However, this underlying assumption makes current fusion frameworks unable to produce any prediction when the LiDAR malfunctions, whether the failure is minor or severe. This fundamentally limits deployment in realistic autonomous driving scenarios. In contrast, we propose a surprisingly simple yet novel fusion framework, dubbed BEVFusion, whose camera stream does not depend on LiDAR input, thus addressing the downside of previous methods.
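The robustness claim can be illustrated with a minimal sketch: both streams independently produce bird's-eye-view (BEV) features, so the fusion step still yields an output when the LiDAR branch is unavailable. Module and shape choices here are assumptions, not the paper's architecture.

import torch
import torch.nn as nn

class RobustBEVFusion(nn.Module):
    def __init__(self, channels=64):
        super().__init__()
        self.fuse = nn.Conv2d(2 * channels, channels, kernel_size=3, padding=1)

    def forward(self, cam_bev, lidar_bev=None):
        if lidar_bev is None:
            # LiDAR malfunction: the camera stream alone still drives
            # the prediction instead of the whole pipeline failing.
            lidar_bev = torch.zeros_like(cam_bev)
        return self.fuse(torch.cat([cam_bev, lidar_bev], dim=1))

cam = torch.randn(1, 64, 128, 128)
out_full = RobustBEVFusion()(cam, torch.randn(1, 64, 128, 128))
out_cam_only = RobustBEVFusion()(cam, None)  # still produces features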
-
Where2comm Communication-Efficient Collaborative Perception via Spatial Confidence Maps
Multi-agent collaborative perception inevitably results in a fundamental trade-off between perception performance and communication bandwidth. To tackle this bottleneck, we propose a spatial confidence map, which reflects the spatial heterogeneity of perceptual information. It empowers agents to share only spatially sparse yet perceptually critical information, addressing the question of where to communicate. Based on this spatial confidence map, we propose Where2comm, a communication-efficient collaborative perception framework.
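A hedged sketch of the selection step: threshold the confidence map and transmit only the BEV cells above it, together with their coordinates. The threshold, shapes, and function name are illustrative assumptions.

import torch

def select_messages(features, confidence, threshold=0.5):
    # features: (C, H, W) BEV features; confidence: (H, W) in [0, 1].
    mask = confidence > threshold                  # where to communicate
    ys, xs = mask.nonzero(as_tuple=True)
    sparse_feats = features[:, ys, xs].T           # (num_selected, C)
    return sparse_feats, torch.stack([ys, xs], dim=1)

feats = torch.randn(64, 100, 100)
conf = torch.rand(100, 100)
msg, coords = select_messages(feats, conf)
print(f"sharing {msg.shape[0]} of {100 * 100} BEV cells")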
-
V2X-ViT Vehicle-to-Everything Cooperative Perception with Vision Transformer
In this paper, we investigate the application of Vehicle-to-Everything (V2X) communication to improve the perception performance of autonomous vehicles. We present a robust cooperative perception framework with V2X communication using a novel vision Transformer. Specifically, we build a holistic attention model, namely V2X-ViT, to effectively fuse information across on-road agents (i.e., vehicles and infrastructure).
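A generic sketch of attention-based agent fusion: per-agent features become tokens, an agent-type embedding distinguishes vehicles from infrastructure, and self-attention mixes information across agents. This illustrates the idea only; V2X-ViT's actual heterogeneous multi-agent attention blocks differ.

import torch
import torch.nn as nn

class AgentFusion(nn.Module):
    def __init__(self, dim=128, heads=4, num_types=2):
        super().__init__()
        self.type_embed = nn.Embedding(num_types, dim)  # 0 = vehicle, 1 = infrastructure
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, agent_feats, agent_types):
        # agent_feats: (batch, num_agents, dim); agent_types: (batch, num_agents)
        x = agent_feats + self.type_embed(agent_types)
        fused, _ = self.attn(x, x, x)
        return fused[:, 0]  # fused representation for the ego agent

feats = torch.randn(2, 4, 128)            # ego + 3 collaborators
types = torch.tensor([[0, 0, 0, 1]] * 2)  # last agent is infrastructure
ego = AgentFusion()(feats, types)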
-
COOPERNAUT End-to-End Driving with Cooperative Perception for Networked Vehicles
Proposes COOPERNAUT, an end-to-end cooperative driving model for networked vehicles based on Point Transformer. COOPERNAUT learns to fuse encoded LiDAR information shared by nearby vehicles under realistic V2V channel capacity.
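One way to picture the channel-capacity constraint, sketched under assumed shapes and a made-up budget parameter: each sender keeps only its strongest encoded keypoint features, and the ego vehicle transforms them into its own frame before fusing.

import torch

def make_message(keypoints, feats, budget=128):
    # keypoints: (N, 3) positions; feats: (N, C) encoded LiDAR features.
    # Keep the `budget` strongest features to respect limited V2V capacity.
    idx = feats.norm(dim=1).topk(min(budget, feats.shape[0])).indices
    return keypoints[idx], feats[idx]

def merge_into_ego(ego_pts, ego_feats, msg_pts, msg_feats, R, t):
    # Rigidly transform sender keypoints into the ego frame, then merge.
    msg_pts_ego = msg_pts @ R.T + t
    return torch.cat([ego_pts, msg_pts_ego]), torch.cat([ego_feats, msg_feats])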
-
Bridging the Domain Gap for Multi-Agent Perception
Existing multi-agent perception algorithms assume all agents have identical neural networks, which might not be practical in the real world. The transmitted features can have a large domain gap when the models differ, leading to a dramatic performance drop in multi-agent perception.
-
Model-Agnostic Multi-Agent Perception Framework
Existing multi-agent perception systems assume that every agent utilizes the same model with identical parameters and architecture. Performance can degrade when agents use different perception models, due to the mismatch in their confidence scores. In this work, we propose a model-agnostic multi-agent perception framework that reduces the negative effect of model discrepancies without sharing model information.
-
F-Cooper Feature based Cooperative Perception for Autonomous Vehicle Edge Computing System Using 3D Point Clouds
Proposes F-Cooper, a point cloud feature based cooperative perception framework for connected autonomous vehicles that achieves better object detection precision.
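The summary above does not spell out the fusion operator; to my understanding, F-Cooper's feature fusion is an element-wise maxout over spatially aligned feature maps from the participating vehicles, sketched below with assumed shapes.

import torch

def maxout_fuse(ego_feats, received_feats):
    # Element-wise max over aligned (C, H, W) feature maps from two vehicles.
    return torch.maximum(ego_feats, received_feats)

fused = maxout_fuse(torch.randn(64, 200, 200), torch.randn(64, 200, 200))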
-
Cooper Cooperative Perception for Connected Autonomous Vehicles based on 3D Point Clouds
The first study of raw-data-level cooperative perception for enhancing the detection ability of self-driving systems. In this work, relying on LiDAR 3D point clouds, we fuse sensor data collected from the different positions and angles of connected vehicles.
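Raw-data-level fusion reduces to a rigid transform plus concatenation, sketched here under an assumed pose representation (rotation R and translation t from the other vehicle's frame to the ego frame).

import numpy as np

def fuse_point_clouds(ego_pts, other_pts, R, t):
    # ego_pts, other_pts: (N, 3) arrays; R: (3, 3) rotation; t: (3,) translation.
    other_in_ego = other_pts @ R.T + t  # transform into the ego frame
    return np.concatenate([ego_pts, other_in_ego], axis=0)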
-
A Survey of Autonomous Driving Common Practices and Emerging Technologies
This paper discusses unsolved problems and surveys the technical aspects of automated driving. Studies regarding present challenges, high-level system architectures, emerging methodologies, and core functions, including localization, mapping, perception, planning, and human-machine interfaces, are thoroughly reviewed.
-
Tackling the Objective Inconsistency Problem in Heterogeneous Federated Optimization
Proposes FedNova, a general framework for analyzing the convergence of heterogeneous federated optimization algorithms.
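FedNova's key mechanism, normalized averaging, fits in a few lines: each client's accumulated update is divided by its number of local steps before aggregation, so clients that run more steps do not skew the global objective. The variable names and the simplification to plain local SGD are mine, not the paper's notation.

import numpy as np

def fednova_round(global_w, client_deltas, local_steps, weights):
    # client_deltas[i] = global_w - w_i after client i's local training;
    # weights[i] is client i's data fraction (weights sum to 1).
    normalized = [d / tau for d, tau in zip(client_deltas, local_steps)]
    tau_eff = sum(p * tau for p, tau in zip(weights, local_steps))
    return global_w - tau_eff * sum(p * d for p, d in zip(weights, normalized))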
-
OPV2V An Open Benchmark Dataset and Fusion Pipeline for Perception with Vehicle-to-Vehicle Communication
A benchmark dataset and fusion pipeline for perception with vehicle-to-vehicle communication.