Direction-of-arrival and power spectral density estimation using a single directional microphone and group-sparse optimization
EURASIP Journal on Audio, Speech, and Music Processing (IF 2.4), Pub Date: 2023-10-04, DOI: 10.1186/s13636-023-00304-8
Elisa Tengan, Thomas Dietzen, Filip Elvander, Toon van Waterschoot

In this paper, two approaches are proposed for estimating the direction of arrival (DOA) and power spectral density (PSD) of stationary point sources by using a single, rotating, directional microphone. These approaches are based on a method previously presented by the authors, in which point source DOAs were estimated by using a broadband signal model and solving a group-sparse optimization problem, where the number of observations made by the rotating directional microphone can be lower than the number of candidate DOAs in an angular grid. The DOA estimation is followed by the estimation of the sources’ PSDs through the solution of an overdetermined least squares problem. The first approach proposed in this paper adds a nonnegativity constraint on the residual noise term when solving the group-sparse optimization problem and is referred to as the Group Lasso Least Squares (GL-LS) approach. The second proposed approach, in addition to the new nonnegativity constraint, employs a narrowband signal model when building the linear system of equations used for formulating the group-sparse optimization problem, where the DOAs and PSDs can be jointly estimated by iterative, group-wise reweighting. This is referred to as the Group Lasso with $l_1$-reweighting (GL-L1) approach. Both proposed approaches are implemented using the alternating direction method of multipliers (ADMM), and their performance is evaluated through simulations in which different setup conditions are considered, ranging from different types of model mismatch to variations in the acoustic scene and microphone directivity pattern. The results show that, in a scenario involving a microphone response mismatch between the observed data and the signal model used, the additional nonnegativity constraint on the residual noise can improve the DOA estimation for GL-LS and the PSD estimation for GL-L1. Moreover, the GL-L1 approach can have an advantage over GL-LS in terms of DOA estimation performance in scenarios with low SNR or where multiple sources are located close to each other. Finally, it is shown that the least squares PSD re-estimation step is beneficial in most scenarios, such that GL-LS outperformed GL-L1 in terms of PSD estimation errors.
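
The two-step idea behind GL-LS (group-sparse selection of candidate DOAs, followed by a least squares re-estimate on the selected groups) can be illustrated with a small sketch. This is not the authors' implementation. The random matrix `A`, the layout of one group per candidate DOA with one entry per frequency bin, the values of `lam` and `rho`, the relative selection threshold, and all function names are assumptions made for illustration. The sketch also omits the nonnegativity constraint on the residual noise term and the group-wise reweighting of GL-L1.

```python
import numpy as np


def group_soft_threshold(v, kappa):
    """Proximal operator of kappa * ||v||_2: shrinks a whole group at once."""
    norm = np.linalg.norm(v)
    if norm <= kappa:
        return np.zeros_like(v)
    return (1.0 - kappa / norm) * v


def group_lasso_admm(A, y, group_size, lam, rho=10.0, n_iter=2000):
    """Minimise 0.5*||A x - y||^2 + lam * sum_g ||x_g||_2 with scaled-form ADMM."""
    n = A.shape[1]
    z = np.zeros(n)  # group-sparse copy of x
    u = np.zeros(n)  # scaled dual variable
    # The x-update always solves the same linear system, so invert it once.
    M_inv = np.linalg.inv(A.T @ A + rho * np.eye(n))
    Aty = A.T @ y
    for _ in range(n_iter):
        x = M_inv @ (Aty + rho * (z - u))      # ridge-like x-update
        v = x + u
        for start in range(0, n, group_size):  # group-wise shrinkage (z-update)
            sl = slice(start, start + group_size)
            z[sl] = group_soft_threshold(v[sl], lam / rho)
        u += x - z                             # scaled dual update
    return z


def ls_refit(A, y, z, group_size, rel_tol=0.1):
    """Keep groups whose norm exceeds rel_tol * (largest group norm), then refit by LS."""
    norms = np.linalg.norm(z.reshape(-1, group_size), axis=1)
    x = np.zeros_like(z)
    if norms.max() == 0.0:
        return [], x
    active = [int(g) for g in np.flatnonzero(norms > rel_tol * norms.max())]
    cols = np.concatenate(
        [np.arange(g * group_size, (g + 1) * group_size) for g in active]
    )
    x[cols] = np.linalg.lstsq(A[:, cols], y, rcond=None)[0]
    return active, x


# Hypothetical toy problem: 16 candidate groups of 4 entries each (64 unknowns),
# but only 48 observations, so the full system is underdetermined.
rng = np.random.default_rng(0)
n_groups, group_size, n_obs = 16, 4, 48
A = rng.standard_normal((n_obs, n_groups * group_size))
x_true = np.zeros(n_groups * group_size)
for g in (2, 7):  # two active "sources"
    x_true[g * group_size:(g + 1) * group_size] = rng.uniform(1.0, 2.0, group_size)
y = A @ x_true + 0.01 * rng.standard_normal(n_obs)

z_hat = group_lasso_admm(A, y, group_size, lam=1.0)
active, x_refit = ls_refit(A, y, z_hat, group_size)
print("active groups:", active)
print("max refit error:", np.max(np.abs(x_refit - x_true)))
```

Once the active groups are fixed, the refit uses only their columns, so the least squares problem has more observations than unknowns even though the full system does not. The refit also removes the shrinkage bias that the group penalty introduces in the first step.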

Updated: 2023-10-05