This paper examines fairness in dimensionality reduction, focusing on Principal Component Analysis (PCA) within recommender systems. It argues that while PCA is a fundamental technique for collaborative filtering, its inherent mechanisms can introduce significant “unfairness” at the item level, leading to suboptimal recommendations, especially for niche or highly popular content.
The core insight is the identification of two underlying popularity-driven mechanisms that cause this unfairness:
- Underserving Less Popular Items: Less popular items often rely on the trailing principal components for accurate representation. Since PCA’s low-rank approximation inherently discards these components, unpopular items are poorly represented, leading to fewer or lower-quality recommendations for them. This means users with niche tastes might not find relevant content (this mechanism is illustrated in the sketch after this list).
- Specialization of Popular Items: For highly popular items, the leading PCA components tend to specialize in capturing information about individual popular items rather than their collaborative similarities with other items. This breaks the “collaborative” aspect that is central to good recommendations, as the system struggles to infer preferences for popular items based on a user’s interaction with other items. This can lead to popular items being recommended in a non-collaborative, almost siloed, manner.
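To make the first mechanism concrete, here is a minimal, self-contained sketch (not from the paper): it builds a synthetic low-rank preference matrix in which popular items have much larger column norms, computes a rank-k truncated-SVD approximation, and compares per-item relative reconstruction error for popular versus niche items. All dimensions and parameter values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical synthetic setup: a low-rank user-item preference matrix in
# which "popular" item loadings are scaled up so their columns dominate.
n_users, n_pop, n_niche, d = 1000, 10, 50, 20
users = rng.normal(size=(n_users, d))
items = rng.normal(size=(n_pop + n_niche, d))
scale = np.concatenate([np.full(n_pop, 5.0), np.full(n_niche, 0.5)])
X = users @ (items * scale[:, None]).T \
    + 0.1 * rng.normal(size=(n_users, n_pop + n_niche))

# Rank-k approximation via truncated SVD (the low-rank step behind
# PCA-based collaborative filtering).
k = 5
U, s, Vt = np.linalg.svd(X, full_matrices=False)
X_hat = U[:, :k] * s[:k] @ Vt[:k]

# Per-item relative reconstruction error. Because the leading components are
# dominated by the high-norm popular columns, niche items tend to rely on the
# discarded trailing components and show larger relative error.
rel_err = np.linalg.norm(X - X_hat, axis=0) / np.linalg.norm(X, axis=0)
print("mean relative error, popular items:", rel_err[:n_pop].mean().round(3))
print("mean relative error, niche items:  ", rel_err[n_pop:].mean().round(3))
```

In setups like this, the niche items' relative error is typically noticeably larger, which is the first mechanism described above.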
To address these issues, the paper proposes a novel algorithm called Item-Weighted PCA (IWPCA). The key innovations of IWPCA are:
- Weighted Objective: IWPCA introduces a flexible weighting scheme that up-weights less popular items in its objective function. This encourages the model to retain more information about these items, counteracting the first unfairness mechanism. The weighting is controlled by a parameter y (e.g., item weights of the form popularity^y), allowing for flexible interpolation (see the sketch after this list).
- Convex Formulation with Hard Rank Constraint: Unlike many previous weighted dimensionality reduction techniques that often lead to non-convex optimization problems or struggle to enforce strict rank constraints, IWPCA is formulated as a convex Semi-Definite Program (SDP). This guarantees a globally optimal solution and explicitly maintains the low-rank structure crucial for efficiency and interpretability in recommendations. The resulting program is shown to be solvable in polynomial time.
- Theoretical Unification: The paper provides a significant theoretical contribution by demonstrating that both standard PCA (referred to as Vanilla PCA) and Normalized PCA (a common baseline that normalizes item vectors before PCA) are specific instances of IWPCA under certain conditions (y = 0 recovers Vanilla PCA and y = -2 recovers Normalized PCA for specific matrix structures). This positions IWPCA as a generalized framework that can interpolate between these existing methods, balancing their respective strengths and weaknesses.
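As a rough illustration of the weighting idea (and not the paper's actual SDP-based algorithm), the sketch below implements a simple column-reweighted low-rank approximation: each item column is scaled by popularity^(y/2) before a truncated SVD and unscaled afterwards, so the reconstruction objective effectively weights item i by popularity_i^y. The function name, popularity proxy, and choice of y are hypothetical.

```python
import numpy as np

def weighted_pca_sketch(X, popularity, y, k):
    """Column-reweighted rank-k approximation (illustrative only).

    Each item column is scaled by popularity**(y/2) before a truncated SVD
    and unscaled afterwards, so the reconstruction objective effectively
    weights item i by popularity_i**y. Assumes strictly positive popularity.
    This heuristic only illustrates the role of the exponent y; the paper's
    IWPCA instead solves a convex SDP with a hard rank constraint.
    """
    w = np.asarray(popularity, dtype=float) ** (y / 2.0)
    Xw = X * w                                  # scale item columns
    U, s, Vt = np.linalg.svd(Xw, full_matrices=False)
    Xw_hat = U[:, :k] * s[:k] @ Vt[:k]          # best rank-k approx of Xw
    return Xw_hat / w                           # undo the column scaling

# Hypothetical usage: y = 0 is plain PCA on X; y < 0 up-weights niche items.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 40))
popularity = np.abs(X).sum(axis=0)              # a simple popularity proxy
X_hat = weighted_pca_sketch(X, popularity, y=-1.0, k=5)
```

Setting y = 0 leaves the matrix unchanged (vanilla PCA), while negative values of y shift reconstruction effort toward less popular items; how exactly this heuristic relates to the paper's convex IWPCA formulation is not spelled out here.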
Main Prior Ingredients:
The work builds upon established foundations in:
- Principal Component Analysis (PCA): Leveraging its mathematical framework for dimensionality reduction, singular value decomposition, and low-rank approximation.
- Collaborative Filtering and Recommender Systems: Applying PCA within the context of user-item preference matrices and the goal of predicting user interest for unseen items.
- Algorithmic Fairness: While prior work on fair PCA primarily focused on user-level fairness (e.g., balancing errors across demographic groups or obscuring sensitive attributes), this paper shifts the focus to item-level fairness and the internal mechanisms of the algorithm itself, rather than external fairness constraints.
- Weighted Matrix Factorization/Dimensionality Reduction: Drawing on the concept of weighting elements in matrix decomposition, but innovating by ensuring convexity and a hard rank constraint, which were often challenges in prior weighted methods.
- Popularity Bias in Recommender Systems: Addressing the well-known phenomenon of popular items being over-recommended, but tackling it at the level of the item’s representation within the model, rather than just post-processing recommendation lists or handling missing data.
- Convex Optimization and Semi-Definite Programming (SDP): Utilizing SDP to formulate a complex objective in a tractable way with global optimality guarantees (see the sketch below).
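For readers unfamiliar with the SDP connection, the sketch below shows the classical semidefinite-programming view of vanilla rank-k PCA (maximizing the inner product of the Gram matrix S with P over the Fantope {P : 0 ⪯ P ⪯ I, tr(P) = k}), solved with cvxpy. This is only meant to illustrate the kind of program involved; it is not the paper's item-weighted formulation, and the dimensions and solver choice are arbitrary.

```python
import numpy as np
import cvxpy as cp

# Classical SDP view of rank-k PCA: maximize <S, P> over the Fantope
# {P : 0 <= P <= I, trace(P) = k}, whose extreme points are exactly the
# rank-k projection matrices. Illustrative only; not the paper's program.
rng = np.random.default_rng(0)
n_items, k = 20, 3
X = rng.normal(size=(200, n_items))
S = X.T @ X                                    # item-item Gram matrix

P = cp.Variable((n_items, n_items), symmetric=True)
constraints = [P >> 0, np.eye(n_items) - P >> 0, cp.trace(P) == k]
prob = cp.Problem(cp.Maximize(cp.trace(S @ P)), constraints)
prob.solve(solver=cp.SCS)

# Recover the principal subspace from the solution by taking the
# top-k eigenvectors of P.
eigvals, eigvecs = np.linalg.eigh(P.value)
V_k = eigvecs[:, -k:]
print("captured variance:", np.trace(V_k.T @ S @ V_k).round(2))
```

Because the Fantope's extreme points are exactly the rank-k orthogonal projections, this linear objective is maximized at (or rounds to) the projection onto the leading principal subspace, which is the kind of structure a weighted variant can build on.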
Empirical evaluations on real-world datasets demonstrate that IWPCA effectively mitigates the identified unfairness mechanisms, leading to reduced specialization and increased collaboration among items, while simultaneously producing superior recommendation quality compared to PCA baselines. This work offers a deeper, mechanistic understanding of fairness issues in fundamental ML algorithms and provides a principled, effective solution.