Interpreting Large Language Models

We share initial thoughts on how to peer inside transformer networks.