Dylan Slack
Title
Cited by
Year
Fooling LIME and SHAP: Adversarial Attacks on Post hoc Explanation Methods
D Slack, S Hilgard, E Jia, S Singh, H Lakkaraju
AAAI/ACM Conference on Artificial Intelligence, Ethics, and Society (AIES), 2020
Cited by: 993 | Year: 2020
Reliable Post hoc Explanations: Modeling Uncertainty in Explainability
D Slack, S Hilgard, S Singh, H Lakkaraju
NeurIPS, 2021
Cited by: 215 | Year: 2021
Counterfactual Explanations Can Be Manipulated
D Slack, S Hilgard, H Lakkaraju, S Singh
NeurIPS, 2021
Cited by: 158 | Year: 2021
Rethinking Explainability as a Dialogue: A Practitioner's Perspective
H Lakkaraju, D Slack, Y Chen, C Tan, S Singh
HCAI @ NeurIPS, 2022
Cited by: 94 | Year: 2022
Explaining machine learning models with interactive natural language conversations using TalkToModel
D Slack, S Krishna, H Lakkaraju, S Singh
Nature Machine Intelligence 5 (8), 873-883, 2023
Cited by: 85* | Year: 2023
Fairness Warnings and Fair-MAML: Learning Fairly with Minimal Data
D Slack, S Friedler, E Givental
ACM Conference on Fairness, Accountability and Transparency (FAccT), 2020
Cited by: 69 | Year: 2020
Assessing the Local Interpretability of Machine Learning Models
D Slack, SA Friedler, C Scheidegger, C Dutta Roy
Workshop on Human Centric Machine Learning, NeurIPS, 2019
Cited by: 67* | Year: 2019
Differentially Private Language Models Benefit from Public Pre-training
G Kerrigan, D Slack, J Tuyls
EMNLP PrivateNLP Workshop, 2020
Cited by: 60 | Year: 2020
Post hoc explanations of language models can improve language models
S Krishna, J Ma, D Slack, A Ghandeharioun, S Singh, H Lakkaraju
NeurIPS, 2023
Cited by: 52 | Year: 2023
A Careful Examination of Large Language Model Performance on Grade School Arithmetic
H Zhang, J Da, D Lee, V Robinson, C Wu, W Song, T Zhao, P Raja, ...
NeurIPS, 2024
Cited by: 35 | Year: 2024
On the Lack of Robust Interpretability of Neural Text Classifiers
MB Zafar, M Donini, D Slack, C Archambeau, S Das, K Kenthapadi
Findings of ACL, 2021
Cited by: 19 | Year: 2021
Active Meta-Learning for Predicting and Selecting Perovskite Crystallization Experiments
V Shekar, G Nicholas, MA Najeeb, M Zeile, V Yu, X Wang, D Slack, Z Li, ...
The Journal of Chemical Physics, 2021
Cited by: 17 | Year: 2021
Tablet: Learning from instructions for tabular data
D Slack, S Singh
arXiv preprint arXiv:2304.13188, 2023
Cited by: 9 | Year: 2023
Feature attributions and counterfactual explanations can be manipulated
D Slack, S Hilgard, S Singh, H Lakkaraju
arXiv preprint arXiv:2106.12563, 2021
Cited by: 8 | Year: 2021
SAFER: Data-Efficient and Safe Reinforcement Learning via Skill Acquisition
D Slack, Y Chow, B Dai, N Wichers
DARL @ ICML, 2022
Cited by: 5 | Year: 2022
Context, language modeling, and multimodal data in finance
S Das, C Giggins, J He, G Karypis, S Krishnamurthy, M Mahajan, ...
Cited by: 4 | Year: 2021
Defuse: Harnessing Unrestricted Adversarial Examples for Debugging Models Beyond Test Accuracy
D Slack, N Rauschmayr, K Kenthapadi
NeurIPS XAI4Debugging Workshop, 2021
Cited by: 4* | Year: 2021
Robust Interactions with Machine Learning Models
D Slack
University of California, Irvine, 2023
Cited by: 2 | Year: 2023
Learning Goal-Conditioned Representations for Language Reward Models
V Nath, D Slack, J Da, Y Ma, H Zhang, S Whitehead, S Hendryx
NeurIPS, 2024
Year: 2024