Key Dates
| Event | Date |
| --- | --- |
| Submissions Open on OpenReview | August 20, 2024 |
| Submission Deadline | September 14, 2024, AoE |
| Acceptance Notification | October 9, 2024, AoE |
| Camera-Ready Deadline | December 1, 2024, AoE |
| Workshop Date | December 14, 2024 |
All deadlines are specified in AoE (Anywhere on Earth).
Description/Call For Papers
The Socially Responsible Language Modelling Research (SoLaR) workshop at NeurIPS 2024 is an interdisciplinary gathering that aims to foster responsible and ethical research in the field of language modeling. Recognizing the significant risks and harms [33-37] associated with the development, deployment, and use of language models (LMs), the workshop emphasizes the need for researchers to address these risks from the early stages of development. The workshop brings together experts and practitioners from various domains and academic fields with a shared commitment to promoting fairness, equity, accountability, transparency, and safety in language modeling research.
Given the wide-ranging impacts of LMs, our workshop will welcome a broad array of submissions. We briefly detail some specific topic areas and an illustrative selection of pertinent works:
- Security and privacy concerns of LMs [13, 30, 25, 49, 55].
- Bias and exclusion in LMs [12, 2, 26, 53, 44].
- Analysis of the development and deployment of LMs, including crowdwork [42, 50], deployment protocols [52, 47], and societal impacts from deployment [10, 21].
- Safety, robustness, and alignment of LMs [51, 8, 35, 32, 7].
- Auditing, red-teaming, and evaluations of LMs [41, 40, 29, 15, 11].
- Examination of risks and harms from any novel input and/or output modalities that are introduced in LMs [14, 28, 54].
- Transparency, explainability, and interpretability of LMs [39, 17, 3, 46, 22, 38].
- Applications of LMs for social good, including sector-specific applications [9, 31, 16] and LMs for low-resource languages [4, 5, 36].
- Perspectives from other domains that can inform socially responsible LM development and deployment [48, 1].
- Studies on economic impacts of LMs, e.g., labor-market disruptions [18, 34].
- Risk assessment [33, 24, 37, 23].
- Regulation and governance of LMs [45, 6, 27].
- Philosophical examination of concepts related to alignment and safety [19, 43, 20].
References
[1] The Grey Hoodie Project: Big Tobacco, Big Tech, and the Threat on Academic Integrity. In AIES 2021.
[2] Persistent Anti-Muslim Bias in Large Language Models. In AIES 2021.
[3] Post hoc Explanations may be Ineffective for Detecting Unknown Spurious Correlation. In ICLR 2022.
[4] A Few Thousand Translations Go a Long Way! Leveraging Pre-trained Models for African News Translation. In NAACL 2022.
[5] MasakhaNER 2.0: Africa-centric Transfer Learning for Named Entity Recognition. In EMNLP 2022.
[6] Managing Emerging Risks to Public Safety, Sept. 2023. URL http://arxiv.org/abs/2307.03718.
[7] Foundational challenges in assuring alignment and safety of large language models. arXiv preprint arXiv:2404.09932, 2024.
[8] Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback, Apr. 2022. URL http://arxiv.org/abs/2204.05862. arXiv:2204.05862 [cs].
[9] Fine-tuning language models to find agreement among humans with diverse preferences. In A. H. Oh, A. Agarwal, D. Belgrave, and K. Cho, editors, Advances in Neural Information Processing Systems, 2022.
[10] On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? In Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency, FAccT ’21.
[11] AI auditing: The broken bus on the road to AI accountability. arXiv preprint arXiv:2401.14462, 2024.
[12] Stereotyping Norwegian Salmon: An Inventory of Pitfalls in Fairness Benchmark Datasets. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pages 1004–1015, Online, Aug. 2021.
[13] What Does it Mean for a Language Model to Preserve Privacy? In 2022 ACM Conference on Fairness, Accountability, and Transparency. ACM, June 2022.
[14] Are aligned neural networks adversarially aligned? Advances in Neural Information Processing Systems, 36, 2023.
[15] Black-box access is insufficient for rigorous AI audits. arXiv preprint arXiv:2401.14446, 2024.
[16] Analyzing Polarization in Social Media: Method and Application to Tweets on 21 Mass Shootings. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pages 2970–3005, Minneapolis, Minnesota, June 2019.
[17] Towards A Rigorous Science of Interpretable Machine Learning, Mar. 2017. URL http://arxiv.org/abs/1702.08608.
[18] GPTs are GPTs: An Early Look at the Labor Market Impact Potential of Large Language Models, Aug. 2023. arXiv:2303.10130.
[19] Artificial intelligence, values, and alignment. Minds and Machines, 30(3):411–437, 2020. Publisher: Springer.
[20] The ethics of advanced AI assistants. arXiv preprint arXiv:2404.16244, 2024.
[21] Predictability and Surprise in Large Generative Models. In 2022 ACM Conference on Fairness, Accountability, and Transparency, FAccT ’22, pages 1747–1764, New York, NY, USA, June 2022. Association for Computing Machinery.
[22] Datasheets for datasets. Communications of the ACM, 64(12):86–92, Dec. 2021.
[23] The false promise of risk assessments. In Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency. ACM, Jan. 2020.
[24] Algorithmic Risk Assessments Can Alter Human Decision-Making Processes in High-Stakes Government Contexts. Proceedings of the ACM on Human-Computer Interaction, 5(CSCW2):418:1–418:33, Oct. 2021.
[25] Predictability and Surprise in Large Generative Models. In 2022 ACM Conference on Fairness, Accountability, and Transparency, FAccT ’22, pages 1747–1764, New York, NY, USA, June 2022.
[26] Datasheets for datasets. Communications of the ACM, 64(12):86–92, Dec. 2021.
[27] The false promise of risk assessments. In Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency. ACM, Jan. 2020.
[28] Algorithmic Risk Assessments Can Alter Human Decision-Making Processes in High-Stakes Government Contexts. Proceedings of the ACM on Human-Computer Interaction, 5(CSCW2):418:1–418:33, Oct. 2021.
[29] Not what you’ve signed up for: Compromising Real-World LLM-Integrated Applications with Indirect Prompt Injection, May 2023.
[30] Bias runs deep: Implicit reasoning biases in persona-assigned LLMs. arXiv preprint arXiv:2311.04892, 2023.
[31] A Real-World WebAgent with Planning, Long Context Understanding, and Program Synthesis, Feb. 2024. URL http://arxiv.org/abs/2307.12856. arXiv:2307.12856 [cs].
[32] The Future of AI Governance, Apr. 2023. URL http://arxiv.org/abs/2304.04914.
[33] Uncovering bias in large vision-language models with counterfactuals. arXiv preprint arXiv:2404.00166, 2024.
[34] Automatically Auditing Large Language Models via Discrete Optimization, Mar. 2023. URL http://arxiv.org/abs/2303.04381.
[35] Deduplicating Training Data Mitigates Privacy Risks in Language Models. In Proceedings of the 39th International Conference on Machine Learning, pages 10697–10707. PMLR, June 2022.
[36] ChatGPT for good? On opportunities and challenges of large language models for education. Learning and Individual Differences, 103:102274, Apr. 2023.
[37] Alignment of Language Agents, Mar. 2021. URL http://arxiv.org/abs/2103.14659.
[38] Model Cards for Model Reporting. In Proceedings of the Conference on Fairness, Accountability, and Transparency, pages 220–229, Jan. 2019.
[39] In-context Learning and Induction Heads. Transformer Circuits Thread, 2022.
[40] Do the Rewards Justify the Means? Measuring Trade-Offs Between Rewards and Ethical Behavior in the MACHIAVELLI Benchmark, Apr. 2023.
[41] Discovering Language Model Behaviors with Model-Written Evaluations, Dec. 2022. URL http://arxiv.org/abs/2212.09251.
[42] The Coloniality of Data Work in Latin America. In Proceedings of the 2021 AAAI/ACM Conference on AI, Ethics, and Society. ACM, July 2021.
[43] A human rights-based approach to responsible AI. arXiv preprint arXiv:2210.02667, 2022.
[44] AI’s regimes of representation: A community-centered study of text-to-image models in South Asia. In Proceedings of the 2023 ACM Conference on Fairness, Accountability, and Transparency, pages 506–517, 2023.
[45] Outsider Oversight: Designing a Third Party Audit Ecosystem for AI Governance. In Proceedings of the 2022 AAAI/ACM Conference on AI, Ethics, and Society. ACM, July 2022.
[46] Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead. Nature Machine Intelligence, 1(5):206–215, May 2019.
[47] Structured access: an emerging paradigm for safe AI deployment, Apr. 2022. URL http://arxiv.org/abs/2201.05159. arXiv:2201.05159 [cs].
[48] The Offense-Defense Balance of Scientific Knowledge: Does Publishing AI Research Reduce Misuse? In Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society, AIES ’20, pages 173–179, New York, NY, USA, Feb. 2020.
[49] Detecting pretraining data from large language models. arXiv preprint arXiv:2310.16789, 2023.
[50] Beyond Fair Pay: Ethical Implications of NLP Crowdsourcing. In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 3758–3769, Online, June 2021. Association for Computational Linguistics.
[51] Defining and Characterizing Reward Hacking. In A. H. Oh, A. Agarwal, D. Belgrave, and K. Cho, editors, Advances in Neural Information Processing Systems, 2022.
[52] The Gradient of Generative AI Release: Methods and Considerations, Feb. 2023.
[53] “Kelly is a warm person, Joseph is a role model”: Gender biases in LLM-generated reference letters. arXiv preprint arXiv:2310.09219, 2023.
[54] Debiasing large visual language models. arXiv preprint arXiv:2403.05262, 2024.
[55] Universal and transferable adversarial attacks on aligned language models. arXiv preprint arXiv:2307.15043, 2023.