News Counterfactual De-bias Method Integrating Causal Intervention

YI Jincheng; JIANG Shaohua; WEN Qipeng

doi:10.20009/j.cnki.21-1106/TP.2025-0206

Journal of Chinese Computer Systems, 2026, Vol. 47, Issue 5: 1147-1155. DOI: 10.20009/j.cnki.21-1106/TP.2025-0206

Algorithm Theory and Artificial Intelligence

News Counterfactual De-bias Method Integrating Causal Intervention

YI Jincheng, JIANG Shaohua, WEN Qipeng

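Abstract (translated from the Chinese)

Given the profound influence of news bias on public perception, social trust, and fairness, using natural language processing to build transparent and interpretable debiasing frameworks has become a research focus at the intersection of communication studies and artificial intelligence. Existing research centers on two types of bias: lexical bias and framing bias. For lexical bias, mainstream methods mostly remove explicitly biased words through word substitution, but neutral-looking words can still carry an implicit stance, and the substitutions adapt poorly to context. For framing bias, existing work mostly builds neutral text through text reconstruction or multi-text fusion generation, but it faces challenges such as the unobservability of framing bias, the difficulty of disentangling conflicting stances, and vague generation objectives, all of which limit further gains in bias mitigation. To address these problems, this paper proposes a multi-stage news bias mitigation method that combines causal intervention with counterfactual reasoning. First, for lexical bias, it builds a multi-stance bias lexicon based on pointwise mutual information (PMI) and introduces a back-door intervention mechanism that replaces words through semantic similarity matching, thereby mitigating explicit bias. Second, to cope with the unobservability of structural framing bias, it introduces counterfactual reasoning and, based on the causal formula TIE = TE - NDE, models the influence of left-leaning and right-leaning frames on neutral expression, where TE denotes the total bias effect, NDE the natural direct effect of the neutral text, and TIE the indirect effect of bias propagation. Finally, a pre-trained bias detector is introduced as an auxiliary supervision module to strengthen the generation model's ability to model textual neutrality and professionalism. Experimental results show that the method significantly outperforms existing mainstream methods on multiple bias mitigation and text quality metrics, confirming its effectiveness and practical value for debiasing multi-source news text.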
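The first stage builds a PMI-based multi-stance bias lexicon. The abstract does not give the corpus, tokenization, smoothing, or thresholds, so the following is only a minimal sketch of how PMI between words and stance labels might be computed. The function name `pmi_lexicon`, the `(stance, tokens)` input format, the `min_count` cutoff, and the toy corpus are all assumptions made for illustration, not the authors' implementation.

```python
import math
from collections import Counter


def pmi_lexicon(docs, min_count=2):
    """Score each (word, stance) pair by pointwise mutual information.

    docs: iterable of (stance_label, token_list) pairs (hypothetical format).
    Returns {stance: {word: PMI}} for words seen at least min_count times.
    """
    word_counts = Counter()    # occurrences of each word over all stances
    stance_counts = Counter()  # number of tokens observed under each stance
    joint_counts = Counter()   # occurrences of (word, stance) together
    for stance, tokens in docs:
        for tok in tokens:
            word_counts[tok] += 1
            stance_counts[stance] += 1
            joint_counts[(tok, stance)] += 1
    total = sum(stance_counts.values())

    lexicon = {stance: {} for stance in stance_counts}
    for (tok, stance), joint in joint_counts.items():
        if word_counts[tok] < min_count:
            continue  # rare words give unstable PMI estimates
        # PMI(w, s) = log2( p(w, s) / (p(w) * p(s)) )
        lexicon[stance][tok] = math.log2(
            joint * total / (word_counts[tok] * stance_counts[stance])
        )
    return lexicon


# Toy corpus, invented for illustration only.
toy_docs = [
    ("left", ["regime", "cuts", "tax"]),
    ("left", ["regime", "policy"]),
    ("right", ["government", "cuts", "tax"]),
    ("right", ["government", "policy"]),
]
lexicon = pmi_lexicon(toy_docs)
# Words with high PMI for one stance are candidates for the bias lexicon.
left_candidates = sorted(lexicon["left"], key=lexicon["left"].get, reverse=True)
```

In this toy corpus, "regime" occurs only in left-leaning text and so ranks first among the left candidates, while words shared evenly by both stances score zero. A real lexicon would need a much larger corpus and probably some smoothing, neither of which is specified here.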

Abstract

Given the profound impact of news bias on public perception, social trust, and fairness, building transparent and interpretable debiasing frameworks with natural language processing (NLP) has become a research focus at the intersection of communication studies and artificial intelligence. Existing work mainly targets lexical bias and framing bias. For lexical bias, mainstream approaches replace explicit biased words but struggle with implicit stance tendencies in neutral words and poor contextual adaptability. For framing bias, text reconstruction or multi-text fusion is often used, yet faces the unobservability of framing bias, difficulty in disentangling stance conflicts, and vague generation objectives, limiting further improvement. To address these issues, we propose a multi-stage news bias mitigation method combining causal intervention and counterfactual reasoning. A PMI-based multi-stance lexicon and a back-door intervention mechanism perform semantic similarity-based word replacement to reduce explicit bias. Counterfactual reasoning with TIE = TE - NDE models the influence of left- and right-leaning frames on neutral expressions, where TE is the total bias effect, NDE is the natural direct effect, and TIE captures indirect bias propagation. A pre-trained bias detector provides auxiliary supervision, enhancing the model's ability to generate neutral and professional text. Experiments show our approach significantly outperforms mainstream methods across multiple debiasing and text quality metrics, confirming its effectiveness in multi-source news debiasing tasks.
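The relation TIE = TE - NDE used for the framing-bias stage matches the standard mediation decomposition from the causal inference literature. The block below writes that decomposition in generic notation as a reading aid. The symbols are assumptions, not taken from the paper: x is the observed (framed) input, x* a reference input, M the mediator, and Y_{x,m} the outcome when the input is set to x and the mediator to m. How the authors map these roles onto left-leaning, right-leaning, and neutral text is not specified in the abstract.

```latex
\begin{align}
\mathrm{TE}  &= Y_{x,\,M_{x}} - Y_{x^{*},\,M_{x^{*}}} \\
\mathrm{NDE} &= Y_{x,\,M_{x^{*}}} - Y_{x^{*},\,M_{x^{*}}} \\
\mathrm{TIE} &= \mathrm{TE} - \mathrm{NDE} = Y_{x,\,M_{x}} - Y_{x,\,M_{x^{*}}}
\end{align}
```

Under this reading, subtracting NDE from TE cancels the shared reference term, leaving only the change in the outcome that flows through the mediator.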

Keywords (translated from the Chinese)

news bias / lexical bias / framing bias / causal intervention / counterfactual reasoning

Key words

news bias / lexical bias / framing bias / causal inference / counterfactual reasoning

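Cite this article

YI Jincheng, JIANG Shaohua, WEN Qipeng. News Counterfactual De-bias Method Integrating Causal Intervention[J]. Journal of Chinese Computer Systems, 2026, 47(5): 1147-1155. DOI: 10.20009/j.cnki.21-1106/TP.2025-0206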
