Among various sensors for assisted and autonomous driving systems, automotive radar has been considered a robust and low-cost solution even in adverse weather or lighting conditions. With the recent development of radar technologies and open-sourced annotated data sets, semantic segmentation with radar signals has become very promising. However, existing methods are either computationally expensive or discard significant amounts of valuable information from raw 3D radar signals by reducing them to 2D planes via averaging. In this work, we introduce ERASE-Net, an Efficient RAdar SEgmentation Network to segment the raw radar signals semantically. The core of our approach is the novel detect-then-segment method for raw radar signals. It first detects the center point of each object, then extracts a compact radar signal representation, and finally performs semantic segmentation. We show that our method can achieve superior performance on the radar semantic segmentation task compared to the state-of-the-art (SOTA) technique. Furthermore, our approach requires up to 20× less computational resources. Finally, we show that the proposed ERASE-Net can be compressed by 40% without significant loss in performance, significantly more than the SOTA network, which makes it a more promising candidate for practical automotive applications.
Precise detection of tiny objects in remote sensing imagery remains a significant challenge due to their limited visual information and frequent occurrence within scenes. This challenge is further exacerbated by the practical burden and inherent errors associated with manual annotation: annotating tiny objects is laborious and prone to errors (i.e., label noise). Training detectors for such objects using noisy labels often leads to suboptimal performance, with networks tending to overfit on noisy labels. In this study, we address the intricate issue of tiny object detection under noisy label supervision. We systematically investigate the impact of various types of noise on network training, revealing the vulnerability of object detectors to class shifts and inaccurate bounding boxes for tiny objects. To mitigate these challenges, we propose a DeNoising Tiny Object Detector (DN-TOD), which incorporates a Class-aware Label Correction (CLC) scheme to address class shifts and a Trend-guided Learning Strategy (TLS) to handle bounding box noise. CLC mitigates inaccurate class supervision by identifying and filtering out class-shifted positive samples, while TLS reduces noisy box-induced erroneous supervision through sample reweighting and bounding box regeneration. Additionally, our method can be seamlessly integrated into both one-stage and two-stage object detection pipelines. Comprehensive experiments conducted on synthetic (i.e., noisy AI-TOD-v2.0 and DOTA-v2.0) and real-world (i.e., AI-TOD) noisy datasets demonstrate the robustness of DN-TOD under various types of label noise. Notably, when applied to the strong baseline RFLA, DN-TOD exhibits a noteworthy performance improvement of 4.9 points under 40% mixed noise. Datasets, code, and models will be made publicly available.
We close three open problems in the separation complexity of valid inequalities for the knapsack polytope. Specifically, we establish that the separation problems for extended cover inequalities, (1,k)-configuration inequalities, and weight inequalities are all NP-complete. We also give a number of special cases where the separation problem can be solved in polynomial time.
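To make the separation problem concrete, the following is a minimal sketch based on the standard textbook definitions of a cover and its extended cover inequality for a knapsack constraint sum_i a_i x_i <= b with binary x. The function names and the small instance are illustrative assumptions and are not taken from the paper. The exhaustive search is exponential in the number of items and is meant only to show what "finding a violated inequality" means, not how one would do it in practice.

```python
from itertools import combinations

def is_cover(weights, capacity, subset):
    """A cover is an item set whose total weight exceeds the capacity."""
    return sum(weights[i] for i in subset) > capacity

def extend(weights, cover):
    """Extended cover E(C): add every item at least as heavy as the heaviest item in C."""
    heaviest = max(weights[i] for i in cover)
    return sorted(set(cover) | {j for j, w in enumerate(weights) if w >= heaviest})

def separate_extended_cover(weights, capacity, x, tol=1e-9):
    """Brute-force search for an extended cover inequality
    sum_{j in E(C)} x_j <= |C| - 1 that the point x violates.
    Returns (cover, extension, violation) for a most violated one, or None."""
    best = None
    n = len(weights)
    for size in range(1, n + 1):
        for cover in combinations(range(n), size):
            if not is_cover(weights, capacity, cover):
                continue
            ext = extend(weights, cover)
            violation = sum(x[j] for j in ext) - (len(cover) - 1)
            if violation > tol and (best is None or violation > best[2]):
                best = (list(cover), ext, violation)
    return best

# Hypothetical instance: 6*x0 + 5*x1 + 4*x2 + 3*x3 <= 9.
weights, capacity = [6, 5, 4, 3], 9
fractional_point = [0.5, 0.4, 1.0, 0.0]   # satisfies the knapsack constraint (weight 9)
result = separate_extended_cover(weights, capacity, fractional_point)
```

On this instance the cover {0, 2} yields the inequality x0 + x2 <= 1, which the fractional point violates, while an integer-feasible point such as (0, 1, 1, 0) violates no inequality of this family.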
We study relaxations for linear programs with complementarity constraints, especially instances whose complementary pairs of variables are not independent. Our formulation is based on identifying vertex covers of the conflict graph of the instance and generalizes the extended reformulation-linearization technique of Nguyen, Richard, and Tawarmalani to instances with general complementarity conditions between variables. We demonstrate how to obtain strong cutting planes for our formulation from both the stable set polytope and the boolean quadric polytope associated with a complete bipartite graph. Through an extensive computational study for three types of practical problems, we assess the performance of our proposed linear relaxation and new cutting planes in terms of the optimality gap closed.
We propose a method to generate cutting planes from multiple covers of knapsack constraints. The covers may come from different knapsack inequalities if the weights in the inequalities form a totally-ordered set. Thus, we introduce and study the structure of a totally-ordered multiple knapsack set. The valid multi-cover inequalities we derive for its convex hull have a number of interesting properties. First, they generalize the well-known (1,k)-configuration inequalities. Second, they are not aggregation cuts. Third, they cannot be generated as a rank-1 Chvátal-Gomory cut from the inequality system consisting of the knapsack constraints and all their minimal cover inequalities. We also provide conditions under which the inequalities are facets for the convex hull of the totally-ordered knapsack set, as well as conditions for those inequalities to fully characterize its convex hull. We give an integer program to solve the separation problem and provide numerical experiments that showcase the strength of these new inequalities.
The complexity class DP is the class of all languages that are the intersection of a language in NP and a language in co-NP, as coined by Papadimitriou and Yannakakis (1982). Hartvigsen and Zemel (1992) conjectured that recognizing a facet for the knapsack polytope is DP-complete. While the recognition problems of facets for polytopes associated with other well-known combinatorial optimization problems, e.g., traveling salesman, node/set packing/covering, are known to be DP-complete, this conjecture on recognizing facets for the knapsack polytope has remained open. We provide a positive answer to this conjecture. Moreover, despite the DP-hardness of the recognition problem, we give a polynomial time algorithm for deciding if an inequality with a fixed number of distinct positive coefficients defines a facet of a knapsack polytope, generalizing a result of Balas (1975).
The complementarity knapsack problem (CKP) is a knapsack problem with real-valued variables and complementarity conditions between pairs of its variables. We extend the polyhedral studies of De Farias et al. for CKP by proposing three new families of cutting planes that are all obtained from a combinatorial concept known as a pack. Sufficient conditions for these inequalities to be facet-defining, based on the concept of a maximal switching pack, are also provided. Moreover, we answer positively a conjecture by De Farias et al. about the separation complexity of the inequalities introduced in their work, and propose efficient separation algorithms for our newly defined cutting planes.
We study an optimization problem originating from the Grothendieck constant. A generalized normal equation is proposed and analyzed. We establish a correspondence between solutions of the generalized normal equation and its dual equation. Explicit solutions are described for the two-dimensional case.
Tiny objects, with their limited spatial resolution, often resemble point-like distributions. As a result, bounding box prediction using point-level supervision emerges as a natural and cost-effective alternative to traditional box-level supervision. However, the small scale and lack of distinctive features of tiny objects make point annotations prone to noise, posing significant hurdles for model robustness. To tackle these challenges, we propose Point Teacher, the first end-to-end point-supervised method for robust tiny object detection in aerial images. To handle label noise from scale ambiguity and location shifts in point annotations, Point Teacher employs the teacher-student architecture and decouples the learning into a two-phase denoising process. In this framework, the teacher network progressively denoises the pseudo boxes derived from noisy point annotations, guiding the student network's learning. Specifically, in the first phase, random masking of image regions facilitates regression learning, enabling the teacher to transform noisy point annotations into coarse pseudo boxes. In the second phase, these coarse pseudo boxes are refined using dynamic multiple instance learning, which adaptively selects the most reliable instance from dynamically constructed proposal bags around the coarse pseudo boxes. Extensive experiments on three tiny object datasets (i.e., AI-TOD-v2, SODA-A, and TinyPerson) validate the proposed method's effectiveness and robustness against point location shifts. Notably, relying solely on point supervision, our Point Teacher already shows comparable performance with box-supervised learning methods. Code and models will be made publicly available.
The target-tracking accuracy of autonomous vehicles is closely related to that of onboard sensors. Methods such as image processing and base station positioning are susceptible to various types of interference in real-world scenarios, resulting in sensor data errors or even losses that ultimately affect the tracking accuracy of autonomous vehicles. This study proposes a target-tracking control method that relies solely on wheel odometry to address this issue. This method incorporates an extended state observer to compensate for the cumulative errors generated by the odometry mechanism, effectively enhancing the robustness and accuracy of the system in complex environments. In addition, a hyperbolic-tangent line-of-sight guidance strategy based on a partition-switching mechanism is designed to improve the dynamic response capability of an autonomous vehicle. This strategy nonlinearly adjusts the tracking error to generate the desired heading angle and velocity, ensuring that the target path tracking is rapid and smooth. First, we establish a mathematical model of an autonomous vehicle and combine the hyperbolic-tangent line-of-sight guidance strategy with a noise-resistant active disturbance rejection controller to achieve high-precision target tracking in dynamic environments. Second, an extended state observer is employed to perform real-time observations and compensate for unknown disturbances during localization, significantly reducing the impact of cumulative errors. Finally, the effectiveness of the proposed method is validated using numerical simulations and real vehicle experiments. The experimental results demonstrate that, compared with the ET-Fuzzy-MPC method, the proposed method lowered the average position tracking error by 45.39% under complex road conditions. In practical curved-path tests, the vehicle's tracking error remained stable to within 0.192 m, representing a significant improvement in the target tracking accuracy and dynamic response performance.
We relate the nonlocal properties of noisy entangled states to Grothendieck's constant, a mathematical constant appearing in Banach space theory. For two-qubit Werner states $\rho_{p}^{W} = p\,|\psi^{-}\rangle\langle\psi^{-}| + (1-p)\,\mathbb{1}/4$, we show that there is a local model for projective measurements if and only if $p \leqslant 1/K_{G}(3)$, where $K_{G}(3)$ is Grothendieck's constant of order 3. Known bounds on $K_{G}(3)$ prove the existence of this model at least for $p \lesssim 0.66$, quite close to the current region of Bell violation, $p \sim 0.71$. We generalize this result to arbitrary quantum states.
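The thresholds quoted above can be related by simple arithmetic. The sketch below assumes the standard fact that the maximal CHSH value attainable with a Werner state is $2\sqrt{2}\,p$, so the CHSH inequality is violated exactly when $p > 1/\sqrt{2}$. It also reads the quoted $p \lesssim 0.66$ as coming from an upper bound on $K_{G}(3)$ through $p = 1/K_{G}(3)$. The function and variable names are illustrative, and the computed bound is only the value implied by the rounded figure in the abstract, not a number taken from the paper.

```python
import math

def critical_visibility(k_g3):
    """Largest p with a local model for projective measurements: p = 1 / K_G(3)."""
    return 1.0 / k_g3

# CHSH is violated by the Werner state when 2*sqrt(2)*p > 2, i.e. p > 1/sqrt(2).
chsh_threshold = 1.0 / math.sqrt(2.0)

# Upper bound on K_G(3) implied by the rounded threshold p ~ 0.66 quoted in the abstract.
implied_upper_bound = 1.0 / 0.66

# Since states with p > 1/sqrt(2) are nonlocal, K_G(3) must be at least sqrt(2).
implied_lower_bound = 1.0 / chsh_threshold
```

Under these assumptions the CHSH threshold evaluates to roughly 0.707, matching the quoted $p \sim 0.71$, and $K_{G}(3)$ is confined between about 1.414 and about 1.52.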
Kyunghyun Cho, Bart van Merriënboer, Caglar Gulcehre, Dzmitry Bahdanau, Fethi Bougares, Holger Schwenk, Yoshua Bengio. Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP). 2014.
It has been widely accepted that the Long Short-Term Memory (LSTM) network, coupled with an attention mechanism and a memory module, is useful for aspect-level sentiment classification. However, existing approaches largely rely on modelling the semantic relatedness of an aspect with its context words, while to some extent ignoring their syntactic dependencies within sentences. Consequently, this may lead to the undesirable result that the aspect attends to contextual words that are descriptive of other aspects. In this paper, we propose a proximity-weighted convolution network to offer an aspect-specific syntax-aware representation of contexts. In particular, two ways of determining proximity weight are explored, namely position proximity and dependency proximity. The representation is primarily abstracted by a bidirectional LSTM architecture and further enhanced by a proximity-weighted convolution. Experiments conducted on the SemEval 2014 benchmark demonstrate the effectiveness of our proposed approach compared with a range of state-of-the-art models. Code is available at https://github.com/GeneZC/PWCN.
Aspect based sentiment analysis (ABSA) can provide more detailed information than general sentiment analysis, because it aims to predict the sentiment polarities of the given aspects or entities in text. We summarize previous approaches into two subtasks: aspect-category sentiment analysis (ACSA) and aspect-term sentiment analysis (ATSA). Most previous approaches employ long short-term memory and attention mechanisms to predict the sentiment polarity of the concerned targets, which are often complicated and need more training time. We propose a model based on convolutional neural networks and gating mechanisms, which is more accurate and efficient. First, the novel Gated Tanh-ReLU Units can selectively output the sentiment features according to the given aspect or entity. The architecture is much simpler than the attention layers used in existing models. Second, the computations of our model can be easily parallelized during training, because convolutional layers do not have time dependency as LSTM layers do, and gating units also work independently. The experiments on SemEval datasets demonstrate the efficiency and effectiveness of our models.
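The gating idea can be sketched in a few lines of plain Python. One convolution branch passes through tanh to produce sentiment features, a second branch adds an aspect term and passes through ReLU to act as a gate, and the two are multiplied before max-pooling over positions. This is a minimal sketch under stated assumptions, not the authors' implementation: the function name, argument layout, and toy weights are all hypothetical.

```python
import math


def gtru_features(X, W_s, b_s, W_a, b_a, V_a, aspect, k=3):
    """Gated Tanh-ReLU sketch.

    X: list of n word vectors (each of length d).
    W_s, W_a: (k*d) x m filter matrices as nested lists; b_s, b_a: length-m biases.
    V_a: d_a x m matrix projecting the aspect vector (length d_a) into the gate.
    Returns the max-over-positions pooled feature vector of length m.
    """
    n, m = len(X), len(b_s)
    # The aspect contribution to the gate is the same at every position.
    aspect_term = [
        sum(aspect[r] * V_a[r][j] for r in range(len(aspect))) for j in range(m)
    ]
    pooled = [-math.inf] * m
    for i in range(n - k + 1):
        window = [v for row in X[i:i + k] for v in row]  # flatten k word vectors
        for j in range(m):
            s = math.tanh(sum(window[t] * W_s[t][j] for t in range(len(window))) + b_s[j])
            gate_in = sum(window[t] * W_a[t][j] for t in range(len(window)))
            a = max(0.0, gate_in + aspect_term[j] + b_a[j])  # ReLU gate
            pooled[j] = max(pooled[j], s * a)
    return pooled


# Hypothetical toy input: 3 words, embedding size 2, window k=2, one filter.
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
W_s = [[0.5], [0.5], [0.5], [0.5]]
W_a = [[0.0], [0.0], [0.0], [0.0]]
print(gtru_features(X, W_s, [0.0], W_a, [0.0], [[1.0]], [2.0], k=2))
```

Each window is processed independently of the others, which is the property the abstract points to when it says the computation parallelizes more easily than an LSTM.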
The deployment of deep convolutional neural networks (CNNs) in many real world applications is largely hindered by their high computational cost. In this paper, we propose a novel learning scheme for CNNs to simultaneously 1) reduce the model size; 2) decrease the run-time memory footprint; and 3) lower the number of computing operations, without compromising accuracy. This is achieved by enforcing channel-level sparsity in the network in a simple but effective way. Different from many existing approaches, the proposed method directly applies to modern CNN architectures, introduces minimum overhead to the training process, and requires no special software/hardware accelerators for the resulting models. We call our approach network slimming, which takes wide and large networks as input models, but during training insignificant channels are automatically identified and pruned afterwards, yielding thin and compact models with comparable accuracy. We empirically demonstrate the effectiveness of our approach with several state-of-the-art CNN models, including VGGNet, ResNet and DenseNet, on various image classification datasets. For VGGNet, a multi-pass version of network slimming gives a 20× reduction in model size and a 5× reduction in computing operations.
Few prior works study deep learning on point sets. PointNet by Qi et al. is a pioneer in this direction. However, by design PointNet does not capture local structures induced by the metric space points live in, limiting its ability to recognize fine-grained patterns and generalizability to complex scenes. In this work, we introduce a hierarchical neural network that applies PointNet recursively on a nested partitioning of the input point set. By exploiting metric space distances, our network is able to learn local features with increasing contextual scales. With further observation that point sets are usually sampled with varying densities, which results in greatly decreased performance for networks trained on uniform densities, we propose novel set learning layers to adaptively combine features from multiple scales. Experiments show that our network called PointNet++ is able to learn deep point set features efficiently and robustly. In particular, results significantly better than state-of-the-art have been obtained on challenging benchmarks of 3D point clouds.
In many robotics and VR/AR applications, 3D-videos are readily-available input sources (a sequence of depth images, or LIDAR scans). However, in many cases, the 3D-videos are processed frame-by-frame either through 2D convnets or 3D perception algorithms. In this work, we propose 4-dimensional convolutional neural networks for spatio-temporal perception that can directly process such 3D-videos using high-dimensional convolutions. For this, we adopt sparse tensors and propose generalized sparse convolutions that encompass all discrete convolutions. To implement the generalized sparse convolution, we create an open-source auto-differentiation library for sparse tensors that provides extensive functions for high-dimensional convolutional neural networks. We create 4D spatio-temporal convolutional neural networks using the library and validate them on various 3D semantic segmentation benchmarks and proposed 4D datasets for 3D-video perception. To overcome challenges in 4D space, we propose the hybrid kernel, a special case of the generalized sparse convolution, and trilateral-stationary conditional random fields that enforce spatio-temporal consistency in the 7D space-time-chroma space. Experimentally, we show that a convolutional neural network with only generalized 3D sparse convolutions can outperform 2D or 2D-3D hybrid methods by a large margin. Also, we show that on 3D-videos, 4D spatio-temporal convolutional neural networks are robust to noise and outperform the 3D convolutional neural network.
In this paper we describe a new mobile architecture, MobileNetV2, that improves the state-of-the-art performance of mobile models on multiple tasks and benchmarks as well as across a spectrum of different model sizes. We also describe efficient ways of applying these mobile models to object detection in a novel framework we call SSDLite. Additionally, we demonstrate how to build mobile semantic segmentation models through a reduced form of DeepLabv3 which we call Mobile DeepLabv3. MobileNetV2 is based on an inverted residual structure where the shortcut connections are between the thin bottleneck layers. The intermediate expansion layer uses lightweight depthwise convolutions to filter features as a source of non-linearity. Additionally, we find that it is important to remove non-linearities in the narrow layers in order to maintain representational power. We demonstrate that this improves performance and provide an intuition that led to this design. Finally, our approach allows decoupling of the input/output domains from the expressiveness of the transformation, which provides a convenient framework for further analysis. We measure our performance on ImageNet [1] classification, COCO object detection [2], and VOC image segmentation [3]. We evaluate the trade-offs between accuracy and the number of operations measured by multiply-adds (MAdd), as well as actual latency and the number of parameters.
We introduce a deep memory network for aspect level sentiment classification. Unlike feature-based SVM and sequential neural models such as LSTM, this approach explicitly captures the importance of each context word when inferring the sentiment polarity of an aspect. Such importance degree and text representation are calculated with multiple computational layers, each of which is a neural attention model over an external memory. Experiments on laptop and restaurant datasets demonstrate that our approach performs comparably to a state-of-the-art feature-based SVM system, and substantially better than LSTM and attention-based LSTM architectures. On both datasets we show that multiple computational layers could improve the performance. Moreover, our approach is also fast. The deep memory network with 9 layers is 15 times faster than LSTM with a CPU implementation.
Convolutional networks are the de-facto standard for analyzing spatio-temporal data such as images, videos, and 3D shapes. Whilst some of this data is naturally dense (e.g., photos), many other data sources are inherently sparse. Examples include 3D point clouds that were obtained using a LiDAR scanner or RGB-D camera. Standard "dense" implementations of convolutional networks are very inefficient when applied on such sparse data. We introduce new sparse convolutional operations that are designed to process spatially-sparse data more efficiently, and use them to develop spatially-sparse convolutional networks. We demonstrate the strong performance of the resulting models, called submanifold sparse convolutional networks (SS-CNs), on two tasks involving semantic segmentation of 3D point clouds. In particular, our models outperform all prior state-of-the-art on the test set of a recent semantic segmentation competition.
Probably the most famous of Grothendieck's contributions to Banach space theory is the result that he himself described as "the fundamental theorem in the metric theory of tensor products". That is now commonly referred to as "Grothendieck's theorem" ("GT" for short), or sometimes as "Grothendieck's inequality". This had a major impact first in Banach space theory (roughly after 1968), then, later on, in $C^*$-algebra theory (roughly after 1978). More recently, in this millennium, a new version of GT has been successfully developed in the framework of "operator spaces" or non-commutative Banach spaces. In addition, GT independently surfaced in several quite unrelated fields: in connection with Bell's inequality in quantum mechanics, in graph theory where the Grothendieck constant of a graph has been introduced, and in computer science where the Grothendieck inequality is invoked to replace certain NP-hard problems by others that can be treated by "semidefinite programming" and hence solved in polynomial time. This expository paper (where many proofs are included) presents a review of all these topics, starting from the original GT. We concentrate on the more recent developments and merely outline those of the first Banach space period since detailed accounts of that are already available, for instance the author's 1986 CBMS notes.
Aspect-level sentiment classification aims at identifying the sentiment polarity of a specific target in its context. Previous approaches have realized the importance of targets in sentiment classification and developed various methods with the goal of precisely modeling their contexts via generating target-specific representations. However, these studies always ignore the separate modeling of targets. In this paper, we argue that both targets and contexts deserve special treatment and need their own representations, learned via interactive learning. Then, we propose the interactive attention networks (IAN) to interactively learn attentions in the contexts and targets, and generate the representations for targets and contexts separately. With this design, the IAN model can well represent a target and its collocative context, which is helpful to sentiment classification. Experimental results on SemEval 2014 Datasets demonstrate the effectiveness of our model.
We survey connections of the Grothendieck inequality and its variants to combinatorial optimization and computational complexity. © 2011 Wiley Periodicals, Inc.
Targeted sentiment analysis is the task of jointly predicting target entities and their associated sentiment information. Existing research efforts mostly regard this joint task as a sequence labeling problem, building models that can capture explicit structures in the output space. However, the importance of capturing implicit global structural information that resides in the input space is largely unexplored. In this work, we argue that both types of information (implicit and explicit structural information) are crucial for building a successful targeted sentiment analysis model. Our experimental results show that properly capturing both types of information leads to better performance than competitive existing approaches. We also conduct extensive experiments to investigate our model's effectiveness and robustness.
Chen Zhang, Qiuchi Li, Dawei Song. Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP). 2019.
Model efficiency has become increasingly important in computer vision. In this paper, we systematically study neural network architecture design choices for object detection and propose several key optimizations to improve efficiency. First, we propose a weighted bi-directional feature pyramid network (BiFPN), which allows easy and fast multi-scale feature fusion. Second, we propose a compound scaling method that uniformly scales the resolution, depth, and width for all backbone, feature network, and box/class prediction networks at the same time. Based on these optimizations and EfficientNet backbones, we have developed a new family of object detectors, called EfficientDet, which consistently achieve much better efficiency than prior art across a wide spectrum of resource constraints. In particular, with single-model and single-scale, our EfficientDet-D7 achieves state-of-the-art 52.2 AP on COCO test-dev with 52M parameters and 325B FLOPs, being 4x - 9x smaller and using 13x - 42x fewer FLOPs than previous detectors. Code is available at https://github.com/google/automl/tree/master/efficientdet.
Aspect-based sentiment analysis aims to determine the sentiment polarity towards a specific aspect in online reviews. Most recent efforts adopt attention-based neural network models to implicitly connect aspects with opinion words. However, due to the complexity of language and the existence of multiple aspects in a single sentence, these models often confuse the connections. In this paper, we address this problem by means of effective encoding of syntax information. Firstly, we define a unified aspect-oriented dependency tree structure rooted at a target aspect by reshaping and pruning an ordinary dependency parse tree. Then, we propose a relational graph attention network (R-GAT) to encode the new tree structure for sentiment prediction. Extensive experiments are conducted on the SemEval 2014 and Twitter datasets, and the experimental results confirm that the connections between aspects and opinion words can be better established with our approach, and the performance of the graph attention network (GAT) is significantly improved as a consequence.
Robust detection and tracking of objects is crucial for the deployment of autonomous vehicle technology. Image based benchmark datasets have driven development in computer vision tasks such as object detection, tracking and segmentation of agents in the environment. Most autonomous vehicles, however, carry a combination of cameras and range sensors such as lidar and radar. As machine learning based methods for detection and tracking become more prevalent, there is a need to train and evaluate such methods on datasets containing range sensor data along with images. In this work we present nuTonomy scenes (nuScenes), the first dataset to carry the full autonomous vehicle sensor suite: 6 cameras, 5 radars and 1 lidar, all with full 360 degree field of view. nuScenes comprises 1000 scenes, each 20s long and fully annotated with 3D bounding boxes for 23 classes and 8 attributes. It has 7x as many annotations and 100x as many images as the pioneering KITTI dataset. We define novel 3D detection and tracking metrics. We also provide careful dataset analysis as well as baselines for lidar and image based detection and tracking. Data, development kit and more information are available online.
Datasets for autonomous cars are essential for the development and benchmarking of perception systems. However, most existing datasets are captured with camera and LiDAR sensors in good weather conditions. In this paper, we present the RAdar Dataset In Adverse weaThEr (RADIATE), aiming to facilitate research on object detection, tracking and scene understanding using radar sensing for safe autonomous driving. RADIATE includes 3 hours of annotated radar images with more than 200K labelled road actors in total, on average about 4.6 instances per radar image. It covers 8 different categories of actors in a variety of weather conditions (e.g., sun, night, rain, fog and snow) and driving scenarios (e.g., parked, urban, motorway and suburban), representing different levels of challenge. To the best of our knowledge, this is the first public radar dataset which provides high-resolution radar images on public roads with a large amount of road actors labelled. The data collected in adverse weather, e.g., fog and snowfall, is unique. Some baseline results of radar based object detection and recognition are given to show that the use of radar data is promising for automotive applications in bad weather, where vision and LiDAR can fail. RADIATE also has stereo images, 32-channel LiDAR and GPS data, directed at other applications such as sensor fusion, localisation and mapping. The public dataset can be accessed at this http URL.
In this work, we propose the use of radar with advanced deep segmentation models to identify open space in parking scenarios. A publicly available dataset of radar observations called SCORP was collected. Deep models are evaluated with various radar input representations. Our proposed approach achieves low memory usage and real-time processing speeds, and is thus very well suited for embedded deployment.
This paper presents an efficient annotation procedure and an application thereof to end-to-end, rich semantic segmentation of the sensed environment using Frequency-Modulated Continuous-Wave scanning radar. We advocate radar over the traditional sensors used for this task as it operates at longer ranges and is substantially more robust to adverse weather and illumination conditions. We avoid laborious manual labelling by exploiting the largest radar-focused urban autonomy dataset collected to date, correlating radar scans with RGB cameras and LiDAR sensors, for which semantic segmentation is an already consolidated procedure. The training procedure leverages a state-of-the-art natural image segmentation system which is publicly available and as such, in contrast to previous approaches, allows for the production of copious labels for the radar stream by incorporating four camera and two LiDAR streams. Additionally, the losses are computed taking into account labels to the radar sensor horizon by accumulating LiDAR returns along a pose-chain ahead of and behind the current vehicle position. Finally, we present the network with multi-channel radar scan inputs in order to deal with ephemeral and dynamic scene objects.
Radar is usually more robust than the camera in severe driving scenarios, e.g., weak/strong lighting and bad weather. However, unlike RGB images captured by a camera, the semantic information from the radar signals is noticeably difficult to extract. In this paper, we propose a deep radar object detection network (RODNet), to effectively detect objects purely from the carefully processed radar frequency data in the format of range-azimuth frequency heatmaps (RAMaps). Three different 3D autoencoder based architectures are introduced to predict object confidence distribution from each snippet of the input RAMaps. The final detection results are then calculated using our post-processing method, called location-based non-maximum suppression (L-NMS). Instead of using burdensome human-labeled ground truth, we train the RODNet using the annotations generated automatically by a novel 3D localization method using a camera-radar fusion (CRF) strategy. To train and evaluate our method, we build a new dataset - CRUW, containing synchronized videos and RAMaps in various driving scenarios. After intensive experiments, our RODNet shows favorable object detection performance without the presence of the camera.
Camera and Lidar processing have been revolutionized with the rapid development of deep learning model architectures. Automotive radar is one of the crucial elements of automated driver assistance and autonomous driving systems. Radar still relies on traditional signal processing techniques, unlike camera and Lidar based methods. We believe this is the missing link to achieve the most robust perception system. Identifying drivable space and occupied space is the first step in any autonomous decision making task. Occupancy grid map representation of the environment is often used for this purpose. In this paper, we propose PolarNet, a deep neural model to process radar information in polar domain for open space segmentation. We explore various input-output representations. Our experiments show that PolarNet is an effective way to process radar data that achieves state-of-the-art performance and processing speeds while maintaining a compact size.
High quality perception is essential for autonomous driving (AD) systems. To reach the accuracy and robustness that are required by such systems, several types of sensors must be combined. Currently, mostly cameras and laser scanners (lidar) are deployed to build a representation of the world around the vehicle. While radar sensors have been used for a long time in the automotive industry, they are still under-used for AD despite their appealing characteristics (notably, their ability to measure the relative speed of obstacles and to operate even in adverse weather conditions). To a large extent, this situation is due to the relative lack of automotive datasets with real radar signals that are both raw and annotated. In this work, we introduce CARRADA, a dataset of synchronized camera and radar recordings with range-angle-Doppler annotations. We also present a semi-automatic annotation approach, which was used to annotate the dataset, and a radar semantic segmentation baseline, which we evaluate on several metrics. Both our code and dataset are available online.
This letter proposes a differentiator for sampled signals with bounded noise and bounded second derivative. It is based on a linear program derived from the available sample information and requires no further tuning beyond the noise and derivative bounds. A tight bound on the worst-case accuracy, i.e., the worst-case differentiation error, is derived, which is the best among all causal differentiators and is moreover shown to be obtained after a fixed number of sampling steps. Comparisons with the accuracy of existing high-gain and sliding-mode differentiators illustrate the obtained results.
Aspect Sentiment Triplet Extraction (ASTE) aims to extract triplets from sentences, where each triplet includes an entity, its associated sentiment, and the opinion span explaining the reason for the sentiment. Most existing research addresses this problem in a multi-stage pipeline manner, which neglects the mutual information between these three elements and suffers from error propagation. In this paper, we propose a Semantic and Syntactic Enhanced aspect Sentiment triplet Extraction model (S$^3$E$^2$) to fully exploit the syntactic and semantic relationships between the triplet elements and jointly extract them. Specifically, we design a Graph-Sequence dual representation and modeling paradigm for the task of ASTE: we represent the semantic and syntactic relationships between word pairs in a sentence by a graph and encode it with Graph Neural Networks (GNNs), while modeling the original sentence with an LSTM to preserve the sequential information. Under this setting, we further apply a more efficient inference strategy for the extraction of triplets. Extensive evaluations on four benchmark datasets show that S$^3$E$^2$ significantly outperforms existing approaches, which demonstrates its superiority and flexibility in an end-to-end fashion.
Object detection using automotive radars has not been explored with deep learning models to the same extent as camera based approaches. This can be attributed to the lack of public radar datasets. In this paper, we collect a novel radar dataset that contains radar data in the form of Range-Azimuth-Doppler tensors along with the bounding boxes on the tensor for dynamic road users, category labels, and 2D bounding boxes on the Cartesian Bird-Eye-View range map. To build the dataset, we propose an instance-wise auto-annotation method. Furthermore, a novel Range-Azimuth-Doppler based multiclass object detection deep learning model is proposed. The algorithm is a one-stage anchor-based detector that generates both 3D bounding boxes and 2D bounding boxes on Range-Azimuth-Doppler and Cartesian domains, respectively. Our proposed algorithm achieves 56.3% AP with IOU of 0.3 on 3D bounding box predictions, and 51.6% with IOU of 0.5 on 2D bounding box prediction. Our dataset and the code can be found at https://github.com/ZhangAoCanada/RADDet.git.
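The IOU thresholds quoted above (0.3 and 0.5) refer to intersection over union between a predicted box and a ground-truth box. The following minimal sketch computes it for axis-aligned 2D boxes. The corner-based box format `(x_min, y_min, x_max, y_max)` and the function name are assumptions for illustration and may differ from the conventions used in the RADDet code.

```python
def box_iou(a, b):
    """Intersection over union of two axis-aligned boxes (x_min, y_min, x_max, y_max)."""
    # Overlap along each axis, clamped at zero when the boxes do not intersect.
    overlap_x = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    overlap_y = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    intersection = overlap_x * overlap_y
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


# Hypothetical boxes: a prediction counts as a match at threshold 0.5 only if IoU >= 0.5.
prediction = (0.0, 0.0, 2.0, 2.0)
ground_truth = (1.0, 1.0, 3.0, 3.0)
print(box_iou(prediction, ground_truth) >= 0.5)
```

The 3D case follows the same pattern with a third overlap term and volumes in place of areas.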
Dependency parse trees are helpful for discovering the opinion words in aspect-based sentiment analysis (ABSA). However, the trees obtained from off-the-shelf dependency parsers are static, and could be sub-optimal in ABSA. This is because the syntactic trees are not designed for capturing the interactions between opinion words and aspect words. In this work, we aim to shorten the distance between aspects and corresponding opinion words by learning an aspect-centric tree structure. The aspect and opinion words are expected to be closer along such a tree structure than along the standard dependency parse tree. The learning process allows the tree structure to adaptively correlate the aspect and opinion words, enabling us to better identify the polarity in the ABSA task. We conduct experiments on five aspect-based sentiment datasets, and the proposed model significantly outperforms recent strong baselines. Furthermore, our thorough analysis demonstrates that the average distance between aspect and opinion words is shortened by at least 19% on the standard SemEval Restaurant14 dataset.
Understanding the scene around the ego-vehicle is key to assisted and autonomous driving. Nowadays, this is mostly conducted using cameras and laser scanners, despite their reduced performance in adverse weather conditions. Automotive radars are low-cost active sensors that measure properties of surrounding objects, including their relative speed, and have the key advantage of not being impacted by rain, snow or fog. However, they are seldom used for scene understanding due to the size and complexity of raw radar data and the lack of annotated datasets. Fortunately, recent open-sourced datasets have opened up research on classification, object detection and semantic segmentation with raw radar signals using end-to-end trainable models. In this work, we propose several novel architectures, and their associated losses, which analyse multiple "views" of the range-angle-Doppler radar tensor to segment it semantically. Experiments conducted on the recent CARRADA dataset demonstrate that our best model outperforms alternative models, derived either from the semantic segmentation of natural images or from radar scene understanding, while requiring significantly fewer parameters. Both our code and trained models are available at https://github.com/valeoai/MVRSS.
A new automotive radar data set with measurements and point-wise annotations from more than four hours of driving is presented. Data provided by four series radar sensors mounted on one test vehicle were recorded, and the individual detections of dynamic objects were then manually grouped into clusters and labeled. The purpose of this data set is to enable the development of novel (machine learning-based) radar perception algorithms with a focus on moving road users. Images of the recorded sequences were captured using a documentary camera. For the evaluation of future object detection and classification algorithms, proposals for score calculation are made so that researchers can evaluate their algorithms on a common basis. Additional information as well as download instructions can be found on the website of the data set: www.radar-scenes.com.
As an important fine-grained sentiment analysis problem, aspect-based sentiment analysis (ABSA), which aims to analyze and understand people's opinions at the aspect level, has been attracting considerable interest in the last decade. To handle ABSA in different scenarios, various tasks have been introduced for analyzing different sentiment elements and their relations, including the aspect term, aspect category, opinion term, and sentiment polarity. Unlike early ABSA works focusing on a single sentiment element, many compound ABSA tasks involving multiple elements have been studied in recent years to capture more complete aspect-level sentiment information. However, a systematic review of the various ABSA tasks and their corresponding solutions is still lacking, a gap we aim to fill in this survey. More specifically, we provide a new taxonomy for ABSA which organizes existing studies along the axes of the sentiment elements concerned, with an emphasis on recent advances in compound ABSA tasks. From the perspective of solutions, we summarize the utilization of pre-trained language models for ABSA, which has raised ABSA performance to a new level. Besides, techniques for building more practical ABSA systems in cross-domain and cross-lingual scenarios are discussed. Finally, we review some emerging topics and discuss some open challenges to outline potential future directions of ABSA.