Abstract
Fairness measurement is crucial for assessing algorithmic bias in various types of machine learning (ML) models, including those used for search relevance, recommendation, personalization, talent analytics, and natural language processing. However, the fairness measurement paradigm is currently dominated by metrics that examine disparities in allocation and/or prediction error as univariate key performance indicators (KPIs) for a single protected attribute or group. Although important and effective for assessing ML bias in certain contexts such as recidivism, existing metrics do not work well in many real-world ML applications, where imperfect models are applied to an array of instances encompassing a multivariate mixture of protected attributes and are embedded in a broader process pipeline. Consequently, the upstream representational harm quantified by existing metrics, based on how a model represents protected groups, does not necessarily translate into allocational harm when such models are applied in downstream policy and decision contexts. We propose FAIR-Frame, a model-based framework for parsimoniously modeling fairness across multiple protected attributes with respect to the representational and allocational harm associated with the upstream design/development and downstream usage of ML models. We evaluate the efficacy of the proposed framework on two sets of testbeds pertaining to text classification using pretrained language models. The upstream testbeds encompass over fifty thousand documents associated with twenty-eight thousand users, seven protected attributes, and five different classification tasks. The downstream testbeds span three policy outcomes and over 5.41 million total observations. In comparison with several existing metrics, results show that the upstream representational harm measures produced by FAIR-Frame and other metrics differ significantly from one another, and that FAIR-Frame's representational fairness measures exhibit the highest percentage alignment and lowest error relative to the allocational harm observed in downstream applications. Our findings have important implications for various ML contexts, including information retrieval, user modeling, digital platforms, and text classification, where responsible and trustworthy AI is becoming an imperative.
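To make the univariate-KPI paradigm critiqued above concrete, the following minimal sketch (our own illustration, not code from the paper; the function names and toy data are hypothetical) computes two conventional fairness metrics of exactly this form, the demographic parity difference and the equalized-odds gap, each of which condenses bias into a single number for one protected attribute:

```python
import numpy as np

def demographic_parity_diff(y_pred, group):
    """Absolute difference in positive-prediction rates between two groups."""
    rates = [y_pred[group == g].mean() for g in (0, 1)]
    return abs(rates[0] - rates[1])

def equalized_odds_gap(y_true, y_pred, group):
    """Max across-group difference in true-positive and false-positive rates."""
    gaps = []
    for label in (0, 1):  # label=1 gives the TPR gap, label=0 the FPR gap
        mask = y_true == label
        rates = [y_pred[mask & (group == g)].mean() for g in (0, 1)]
        gaps.append(abs(rates[0] - rates[1]))
    return max(gaps)

# Toy data: a classifier whose positive rate is skewed toward group 1.
rng = np.random.default_rng(0)
n = 1000
group = rng.integers(0, 2, n)                              # one protected attribute
y_true = rng.integers(0, 2, n)                             # ground-truth outcomes
y_pred = (rng.random(n) < 0.5 + 0.1 * group).astype(int)   # biased predictions

print(demographic_parity_diff(y_pred, group))
print(equalized_odds_gap(y_true, y_pred, group))
```

Each call yields one KPI for one protected attribute. With seven protected attributes and their intersections, such scores multiply without composing into a pipeline-level assessment, which is the gap a model-based framework like FAIR-Frame is intended to address.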