Writing | Nicholas Carlini


2025

Machines of Ruthless Efficiency: Future LLMs have the potential to cause significant harm due to their ruthless efficiency. I'm worried this will happen, and discuss the ways in which it might.

My Thoughts on the Future of "AI": Finally I've written down my thoughts on the future of AI; I think things could either go crazy fast or basically plateau, and wouldn't be surprised by either.

What my privacy papers (don't) have to say about copyright and generative AI: I write papers on privacy. Lawyers cite these papers in copyright cases. Here I explain why this might not be what they want.

Career Update: Google DeepMind -> Anthropic: I'm moving from Google to Anthropic, and in this post explain why.

AI forecasting retrospective: By studying the results of my forecasting survey from last year, I find almost everyone is overconfident.

Regex Chess: A 2-ply minimax chess engine written using 84,688 regular expressions.

Letting Language Models Write my Website: I let a language model write my bio. It went about as well as you might expect.

2024

Forecasting the future of AI: A set of 30 questions about the future of AI you can answer, and I'll tell you how you did in a few years.

How I Use "AI": Fifty different examples of how I've used LLMs to meaningfully improve my ability to write code and perform research.

Why I Attack: A response to someone who called me out for not caring about my impact on the world because I like to break things.

(yet another) Broken Adversarial Example Defense at IEEE S&P 2024: I broke another defense to adversarial examples by fixing 1 line of code; in this post I complain about the state of the field of adversarial robustness.

My benchmark for large language models: A benchmark of ~100 tests for language models, collected from actual questions I've asked of language models in the last year.

My Research Idea Logfile, 2016-2019: A description of how I keep track of my research ideas, with my complete log from when I started it in 2016 through to the end of 2019.

2023

Reading Data off an Apple ProFile Hard Drive with an Arduino: A short writeup of how to read data off a 1980s Apple ProFile hard drive using an Arduino.

Playing chess with large language models: I built a bot to play chess by querying a text language model. It sees the sequence of moves in order (as text!), and predicts which move comes next. It's better than me.

Little Bobby <|endoftext|>: I found a fun exploit in ChatGPT that causes it to behave weirdly.

A GPT-4 Forecasting Challenge: Test your ability to predict (in a calibrated manner) whether or not GPT-4 can answer a range of questions from coding to poetry to baking.

A ChatGPT clone, in 3000 bytes of C, backed by GPT-2: A dependency-free implementation of GPT-2, including byte-pair encoding and transformer inference, in ~3000 bytes of C. I then use this to create something like ChatGPT.

2022

Reflecting on “Towards Evaluating the Robustness of Neural Networks”: A few thoughts about the paper that brought me into the field of adversarial machine learning.

Rapid Iteration in Machine Learning Research: I wrote a tool to help me quickly iterate on research ideas by snapshotting Python state.

A Case of Plagiarism in Machine Learning: A recent paper has copied a bunch of text from over a dozen prior papers. This is bad.

Multiplexing Circuits on the Game of Life - Part 5: Wherein I yet again re-design my game of life circuit setup and make things even more efficient.

Research Paper Release Checklist: Steps to take to reduce the likelihood of embarrassing errors when submitting papers, uploading research papers to arXiv, or submitting final camera-ready papers.

2021

A Simple CPU on the Game of Life - Part 4: A full Turing-complete Unlimited Register Machine implemented on top of the game of life.

Yet Another MOBA (In 13kb of JavaScript): an online multiplayer game as part of a series on game-development in 13k of JavaScript.

Improved Logic Gates on Conway's Game of Life - Part 3: more efficient digital logic gates constructed on top of the game of life.

2020

Yet Another Space Game (In 13kb of JavaScript): another small pointless game building on my prior doom clone.

InstaHide Disappointingly Wins Bell Labs Prize, 2nd Place: InstaHide, a recent scheme that claims to train neural networks with privacy, is completely broken but was awarded the Bell Labs Prize, 2nd place.

Screen Recording of Breaking a Defense to Adversarial Examples: I broke another defense, but this time recorded my screen for the entire 2.5-hour session it took.

An Introduction to Circuit Design on Conway's Game of Life - Part 2: Basic circuit design to build a 7-segment display using the AND/OR/NOT gates built last time.

Digital Logic Gates on Conway's Game of Life - Part 1: Constructing game of life “gadgets” that act as digital logic gates, allowing Turing-complete computation.

Are Adversarial Example Defenses Improving?: A short collection of thoughts after writing a paper where we broke a dozen recent defenses to adversarial examples, again.

2019

Yet Another Doom Clone (In 13kb of JavaScript): exactly what it sounds like; an entry for js13k 2019.

A 3D Shadow Mapping Renderer in JavaScript: because it's possible.

List of All Adversarial Example Papers: a continuously-updating list of all 1000+ papers written on adversarial examples available on arXiv.

2018

Adversarial Machine Learning Reading List: a collection of papers I recommend reading for those interested in studying adversarial machine learning (for the time being, focusing on the sub-field of adversarial examples).

Advice on Evaluating Adversarial Example Defenses: recommendations for how to perform adversarial example defense evaluations (or how to determine if an evaluation in a defense paper is adequate).