Socially Responsible Language Modelling Research (SoLaR) 2024

Contact: solar-neurips@googlegroups.com.

Key Dates

Submissions Open	TBD
Submission Deadline	August 30, 2024, AoE
Acceptance Notification	October 14, 2024, AoE
Camera-Ready Deadline	TBD
Workshop Date	December 14 or 15, 2024

All deadlines are specified in AoE (Anywhere on Earth).

Description/Call For Papers

The Socially Responsible Language Modelling Research (SoLaR) workshop at NeurIPS 2024 is an interdisciplinary gathering that aims to foster responsible and ethical research in the field of language modeling. Recognizing the significant risks and harms [33-37] associated with the development, deployment, and use of language models, the workshop emphasizes the need for researchers to focus on addressing these risks starting from the early stages of development. The workshop brings together experts and practitioners from various domains and academic fields with a shared commitment to promoting fairness, equity, accountability, transparency, and safety in language modeling research.

We anticipate a broad array of submissions given the multi-faceted impact of LMs. Our focus areas will include but are not limited to:

Security and privacy concerns of LMs [10, 17, 15]
Bias and exclusion in LMs [9, 2]
Analysis of the development and deployment of LMs, including crowdwork [26, 30], deployment protocols [32, 28], and societal impacts from deployment [8, 13]
Safety, robustness, and alignment of LMs [31, 6, 20, 19]
Auditing, red-teaming, and evaluations of LMs [25, 24, 16]
Transparency, explainability, and interpretability of LMs [23, 12, 3, 27, 14, 22]
Applications of LMs for social good, including sector-specific applications [7, 18, 11] and LMs for low-resource languages [4, 5, 21]
Perspectives from other domains that can inform socially responsible LM development and deployment [29, 1]

We also encourage sociotechnical submissions from other disciplines such as philosophy, law, and policy, in order to foster an interdisciplinary dialogue on the societal impacts of LMs.

References

[1] The Grey Hoodie Project: Big Tobacco, Big Tech, and the Threat on Academic Integrity. In AIES 2021.
[2] Persistent Anti-Muslim Bias in Large Language Models. In AIES 2021.
[3] Post hoc Explanations may be Ineffective for Detecting Unknown Spurious Correlation. In ICLR 2022.
[4] A Few Thousand Translations Go a Long Way! Leveraging Pre-trained Models for African News Translation. In NAACL 2022.
[5] MasakhaNER 2.0: Africa-centric Transfer Learning for Named Entity Recognition. In EMNLP 2022.
[6] Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback, Apr. 2022. URL http://arxiv.org/abs/2204.05862. arXiv:2204.05862.
[7] Fine-tuning language models to find agreement among humans with diverse preferences. In NeurIPS 2022.
[8] On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? In Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency, FAccT 2021.
[9] Stereotyping Norwegian Salmon: An Inventory of Pitfalls in Fairness Benchmark Datasets. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics, August 2021.
[10] What Does it Mean for a Language Model to Preserve Privacy? In 2022 ACM Conference on Fairness, Accountability, and Transparency. ACM, June 2022.
[11] Analyzing Polarization in Social Media: Method and Application to Tweets on 21 Mass Shootings. In 2019 NAACL, June 2019.
[12] Towards A Rigorous Science of Interpretable Machine Learning, Mar. 2017. URL http://arxiv.org/abs/1702.08608.
[13] Predictability and Surprise in Large Generative Models. In 2022 ACM Conference on Fairness, Accountability, and Transparency, FAccT ’22, June 2022.
[14] Datasheets for datasets. Communications of the ACM, 64(12):86–92, Dec. 2021.
[15] Not what you’ve signed up for: Compromising Real-World LLM-Integrated Applications with Indirect Prompt Injection, May 2023. URL http://arxiv.org/abs/2302.12173. arXiv:2302.12173.
[16] Automatically Auditing Large Language Models via Discrete Optimization, Mar. 2023. URL http://arxiv.org/abs/2303.04381. arXiv:2303.04381.
[17] Deduplicating Training Data Mitigates Privacy Risks in Language Models. In ICML 2022.
[18] ChatGPT for good? On opportunities and challenges of large language models for education. Learning and Individual Differences, 103:102274, Apr. 2023. URL https://www.sciencedirect.com/science/article/pii/S1041608023000195.
[19] Alignment of Language Agents, Mar. 2021. URL http://arxiv.org/abs/2103.14659. arXiv:2103.14659.
[20] S. Lin, J. Hilton, and O. Evans. TruthfulQA: Measuring How Models Mimic Human Falsehoods. In ACL 2022.
[21] Challenges of language technologies for the indigenous languages of the Americas. In ACL 2018.
[22] Model Cards for Model Reporting. In FAccT 2019.
[23] In-context Learning and Induction Heads. Transformer Circuits Thread, 2022.
[24] Do the Rewards Justify the Means? Measuring Trade-Offs Between Rewards and Ethical Behavior in the MACHIAVELLI Benchmark, Apr. 2023. URL http://arxiv.org/abs/2304.03279. arXiv:2304.03279.
[25] Discovering Language Model Behaviors with Model-Written Evaluations, Dec. 2022. URL http://arxiv.org/abs/2212.09251. arXiv:2212.09251.
[26] The Coloniality of Data Work in Latin America. In AIES 2021.
[27] Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead. Nature Machine Intelligence, 1(5):206–215, May 2019.
[28] Structured access: an emerging paradigm for safe AI deployment, Apr. 2022. URL http://arxiv.org/abs/2201.05159. arXiv:2201.05159.
[29] The Offense-Defense Balance of Scientific Knowledge: Does Publishing AI Research Reduce Misuse? In AIES ’20, Feb. 2020.
[30] Beyond Fair Pay: Ethical Implications of NLP Crowdsourcing. In NAACL 2021.
[31] Defining and Characterizing Reward Hacking. In NeurIPS 2022.
[32] The Gradient of Generative AI Release: Methods and Considerations, Feb. 2023. URL http://arxiv.org/abs/2302.04844. arXiv:2302.04844.
[33] On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? In FAccT 2021.
[34] Taxonomy of Risks posed by Language Models. In FAccT 2022.
[35] Sociotechnical Harms: Scoping a Taxonomy for Harm Reduction, Oct. 2022. URL http://arxiv.org/abs/2210.05791.
[36] Predictability and Surprise in Large Generative Models. In FAccT 2022.
[37] Emergent Abilities of Large Language Models. In TMLR 2022.
