<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Notes on Vincent Cheng</title><link>https://vincent-cheng.com/notes/</link><description>Recent content in Notes on Vincent Cheng</description><generator>Hugo</generator><language>en-us</language><lastBuildDate>Wed, 14 May 2025 00:00:00 -0800</lastBuildDate><atom:link href="https://vincent-cheng.com/notes/index.xml" rel="self" type="application/rss+xml"/><item><title>LLMs are Making Me Dumber</title><link>https://vincent-cheng.com/llms-are-making-me-dumber/</link><pubDate>Wed, 14 May 2025 00:00:00 -0800</pubDate><guid>https://vincent-cheng.com/llms-are-making-me-dumber/</guid><description>&lt;p&gt;Here are some ways I use LLMs that I think are making me dumber:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;When I want to build a Chrome extension for personal use, instead of actually learning and writing the JavaScript, I Claude-Code the whole thing in a couple of hours without writing a single line of code.
Instead of taking the usual route, which would leave me with real familiarity with JavaScript, I now shortcut the process and end up with barely any JS knowledge despite numerous functioning applications.&lt;/li&gt;
&lt;li&gt;When I need math homework done fast, I feed the relevant textbook pages into context, dump my problems into o3/Gemini, and sanity-check its answers instead of doing the problems myself. I cram before tests. (Yes, this is morally dubious and terrible for learning.)&lt;/li&gt;
&lt;li&gt;When I need to write an email, I often bullet-point what I want to write and ask the LLM to write out a coherent, cordial email. I’ve gotten worse at writing emails.&lt;/li&gt;
&lt;li&gt;My first response to most problems is to ask an LLM, and this might atrophy my ability to come up with better solutions since my starting point is already in the LLM-solution space.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;These are all deliberate trade-offs I make for the sake of output speed. By sacrificing depth in my learning, I can produce substantially more work. I’m unsure if I’m at the correct balance between output quantity and depth of learning.
This uncertainty is mainly fueled by a sense of urgency due to rapidly improving AI models.
I don’t have time to learn everything deeply. I love learning, but given current trends, I want to maximize immediate output. I’m sacrificing some learning in classes for more time doing outside work. From a teacher’s perspective, this is obviously bad, but from my subjective standpoint, it’s unclear.&lt;/p&gt;</description></item><item><title>1% Improvements</title><link>https://vincent-cheng.com/1-improvements/</link><pubDate>Thu, 06 Mar 2025 00:00:00 -0800</pubDate><guid>https://vincent-cheng.com/1-improvements/</guid><description>&lt;p&gt;Here&amp;rsquo;s a running list of tiny workflow tips that make my day-to-day noticeably smoother.
Most of these are embarrassingly simple, but that&amp;rsquo;s the point!
Habits that seem obvious to me might be totally new to someone else (and vice versa).&lt;/p&gt;
&lt;p&gt;I&amp;rsquo;ll keep updating this roughly every month and try to include only things I&amp;rsquo;ve kept using for more than a month.
Message me your favorites and I&amp;rsquo;ll include them!&lt;/p&gt;</description></item><item><title>Metal Pins Simulation</title><link>https://vincent-cheng.com/metal-pins-simulation/</link><pubDate>Sun, 09 Feb 2025 00:00:00 -0800</pubDate><guid>https://vincent-cheng.com/metal-pins-simulation/</guid><description>&lt;p&gt;This web app uses your webcam, a lightweight depth model, and three.js to recreate that oddly satisfying metal pin toy effect.
&lt;a href="https://vncntt.github.io/metal_pins/"&gt;Try it out here&lt;/a&gt;.
&lt;img src="https://vincent-cheng.com/metalpin1.jpg" alt="Metal pins simulation"&gt;
&lt;img src="https://vincent-cheng.com/metal_pins.gif" alt="Metal pins"&gt;
Try out the simulation for yourself &lt;a href="https://vncntt.github.io/metal_pins/"&gt;here&lt;/a&gt; and check out the code &lt;a href="https://github.com/vncntt/metal_pins"&gt;here&lt;/a&gt;.&lt;/p&gt;</description></item><item><title>15 Questions</title><link>https://vincent-cheng.com/15-questions/</link><pubDate>Tue, 04 Feb 2025 00:00:00 -0800</pubDate><guid>https://vincent-cheng.com/15-questions/</guid><description>&lt;ol&gt;
&lt;li&gt;What do truly long-context models look like?
I want to give the model all my journals, notes, pictures, previous work, etc. so that it can make connections and tailor responses for me.
I imagine this sitting somewhere between context stuffing and fine-tuning. Every ~day, the model takes all the conversations from that day and decides which to use to update its weights. In the future, will everyone have custom models?
Predictive processing?&lt;/li&gt;
&lt;li&gt;What will human-AI collaboration look like in the future?&lt;/li&gt;
&lt;li&gt;How much software will humans be writing in three years?
What are the comparative advantages of humans?&lt;/li&gt;
&lt;li&gt;Is &amp;ldquo;We don&amp;rsquo;t need to find the most general, all-modality, solution. We just need to get something good enough to automate research. That&amp;rsquo;s the goal. After that, there&amp;rsquo;s a clear path and we&amp;rsquo;re just on high-level steering.&amp;rdquo; wrong?&lt;/li&gt;
&lt;li&gt;Has someone created a gym environment that is a computer simulation? Actions are anything someone can do on a computer. After each episode, unit tests are run to determine reward. Why are we using screenshots?&lt;/li&gt;
&lt;li&gt;How much does o1-style reasoning RL transfer to performing long-horizon tasks for computer use?&lt;/li&gt;
&lt;li&gt;I don&amp;rsquo;t get how we&amp;rsquo;re passing the synthetic data wall. Yes, you can use o3 outputs to fine-tune 4o and get a really good o3-mini, but can you use oN outputs to get oN+1?&lt;/li&gt;
&lt;li&gt;Can you get two models to communicate through residual streams and not text? Or CoT in the latent space instead of writing everything out? Is this desirable? How do you get training data for this?
A quick perplexity search &lt;a href="https://transformer-circuits.pub/2023/privileged-basis/index.html"&gt;gets&lt;/a&gt; &lt;a href="https://arxiv.org/html/2406.03230v2"&gt;me&lt;/a&gt; &lt;a href="https://www.alignmentforum.org/posts/X26ksz4p3wSyycKNB/gears-level-mental-models-of-transformer-interpretability"&gt;these&lt;/a&gt; &lt;a href="https://www.reddit.com/r/LocalLLaMA/comments/1gxxqs9/why_should_thoughts_be_word_tokens_in_o1_style/"&gt;links&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;We have text-to-text, text-to-image, text-to-video. What is the SOTA for text-to-action tokens in robots? There must be a way to bring the world understanding that language models have to robotics. How?&lt;/li&gt;
&lt;li&gt;How much do traders use ML? It seems like a ripe field for it. Lots of money, data, smart people… Everything is probably private.&lt;/li&gt;
&lt;li&gt;Why is Moravec&amp;rsquo;s paradox true?&lt;/li&gt;
&lt;li&gt;How is Adam still the best optimizer after 10 years?&lt;/li&gt;
&lt;li&gt;How do lightweight code-generation models like Cursor&amp;rsquo;s work?&lt;/li&gt;
&lt;li&gt;What is going on in interpretability these days?&lt;/li&gt;
&lt;li&gt;Why are all the benchmarks in math and coding competitions? What happened to physics?&lt;/li&gt;
&lt;/ol&gt;</description></item><item><title>Ideas</title><link>https://vincent-cheng.com/ideas/</link><pubDate>Fri, 17 Jan 2025 00:00:00 -0800</pubDate><guid>https://vincent-cheng.com/ideas/</guid><description>&lt;p&gt;Some ideas I find interesting but don&amp;rsquo;t have enough time to make a reality.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;A &amp;ldquo;See all context&amp;rdquo; button. This is especially useful for reasoning models since I want to know which parts of the reasoning were dropped so I know how much context to provide.
You would have to drop sensitive information like system prompts.&lt;/li&gt;
&lt;li&gt;JARVIS. Models are capable enough already! We just need some better scaffolding like Cursor and some way to fit in a lot of context. I really liked GPT with scheduled tasks. If you patch 20 of these together, you get a really good assistant. One that would be &lt;em&gt;proactive&lt;/em&gt;. I don&amp;rsquo;t want to ever miss a call again.&lt;/li&gt;
&lt;li&gt;Surfing footage drone.
Before surfing, I set off a drone which follows me and records cool footage of me catching waves.&lt;/li&gt;
&lt;li&gt;Implement needle in a haystack.
I swear my experience using models doesn&amp;rsquo;t correspond with the needle-in-a-haystack results labs put out.
The current &amp;ldquo;insert random sentence&amp;rdquo; method doesn&amp;rsquo;t seem great either.
A better eval might combine facts from the beginning and the end with simple reasoning steps.&lt;/li&gt;
&lt;li&gt;Machine translation through steering vectors?&lt;/li&gt;
&lt;li&gt;Give Claude/GPT/&amp;hellip; a decent prompt and scaffolding and let it loose on X.&lt;/li&gt;
&lt;li&gt;Taiwanese news is 50% TSMC and gets updates instantly, but it takes a while for this to reach US news outlets. Make a scraper of the big Taiwanese outlets and, whenever an article is about TSMC, automatically translate it to English and post it somewhere.&lt;/li&gt;
&lt;li&gt;&lt;a href="https://aayushg.com/ideas"&gt;https://aayushg.com/ideas&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;How did my water get here? Enter in a location/building and see where your water source is from.
What pipes did your water go through?
What water plant is it from?
What&amp;rsquo;s the water source?&lt;/li&gt;
&lt;li&gt;A script that goes through a bunch of websites, records 404s or otherwise unavailable pages, and emails people when a site is down.&lt;/li&gt;
&lt;/ul&gt;</description></item><item><title>Quotes</title><link>https://vincent-cheng.com/good-quotes/</link><pubDate>Tue, 14 Jan 2025 00:00:00 -0800</pubDate><guid>https://vincent-cheng.com/good-quotes/</guid><description>&lt;blockquote&gt;
&lt;p&gt;The Man In The Arena&lt;/p&gt;
&lt;p&gt;It is not the critic who counts; not the man who points out how the strong man stumbles, or where the doer of deeds could have done them better. The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood; who strives valiantly; who errs, who comes short again and again, because there is no effort without error and shortcoming; but who does actually strive to do the deeds; who knows great enthusiasms, the great devotions; who spends himself in a worthy cause; who at the best knows in the end the triumph of high achievement, and who at the worst, if he fails, at least fails while daring greatly, so that his place shall never be with those cold and timid souls who neither know victory nor defeat.&lt;/p&gt;</description></item><item><title>The Best Sport</title><link>https://vincent-cheng.com/the-best-sport/</link><pubDate>Sat, 04 Jan 2025 00:00:00 -0800</pubDate><guid>https://vincent-cheng.com/the-best-sport/</guid><description>&lt;p&gt;Brazilian Jiu-Jitsu (BJJ) is a grappling sport which means there is no punching or kicking and you win by gaining dominant positions or submissions such as chokeholds or joint locks.&lt;/p&gt;
&lt;h2 id="what-makes-bjj-different"&gt;What Makes BJJ Different?&lt;/h2&gt;
&lt;p&gt;BJJ is the &amp;ldquo;broadest&amp;rdquo; sport I&amp;rsquo;ve done.
It feels like math in the sense that you have so many distinct, but related, concepts to learn and problem-solving to do.
In tennis or basketball, I felt like after building up the basics, I was repetitively honing small details to eke out 1-3% improvements (which can definitely still be fun).
In BJJ, there are black belts who are still not familiar with many positions.
There is definitely still lots of practice spent honing details of specific moves, but it feels like there is more pure learning (compared to refining) going on.
Given any move or position, you can break it down into broad principles, details, counters, and counter counters.
It&amp;rsquo;s a lot like physical chess.&lt;/p&gt;</description></item><item><title>Media I've Enjoyed</title><link>https://vincent-cheng.com/media-ive-enjoyed/</link><pubDate>Thu, 26 Dec 2024 00:00:00 -0800</pubDate><guid>https://vincent-cheng.com/media-ive-enjoyed/</guid><description>&lt;p&gt;obviously non-exhaustive&lt;/p&gt;
&lt;h2&gt;Movies/shows&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;Monster (Japanese movie): messes with your head. 8/10&lt;br&gt;&lt;/li&gt;
&lt;li&gt;Squid Game&lt;/li&gt;
&lt;li&gt;The Prestige: It&amp;rsquo;s so good. I don&amp;rsquo;t get how it&amp;rsquo;s not Interstellar-level popular. 10/10&lt;br&gt;&lt;/li&gt;
&lt;li&gt;Alice in Borderland&lt;br&gt;&lt;/li&gt;
&lt;li&gt;Top Gun&lt;br&gt;&lt;/li&gt;
&lt;li&gt;Money Heist&lt;br&gt;&lt;/li&gt;
&lt;li&gt;Stranger Things&lt;br&gt;&lt;/li&gt;
&lt;li&gt;Arrival&lt;br&gt;&lt;/li&gt;
&lt;li&gt;火神的眼淚 (Tears on Fire)&lt;br&gt;&lt;/li&gt;
&lt;li&gt;模仿犯 (Copycat Killer)&lt;br&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;Books&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;Open (Agassi biography)&lt;br&gt;&lt;/li&gt;
&lt;li&gt;Three Body Problem series&lt;br&gt;&lt;/li&gt;
&lt;li&gt;When Breath Becomes Air&lt;br&gt;&lt;/li&gt;
&lt;li&gt;Tomorrow, and Tomorrow, and Tomorrow&lt;br&gt;&lt;/li&gt;
&lt;li&gt;How Not to Be Wrong&lt;br&gt;&lt;/li&gt;
&lt;li&gt;Norwegian Wood&lt;br&gt;&lt;/li&gt;
&lt;li&gt;There is No Antimemetics Division&lt;br&gt;&lt;/li&gt;
&lt;/ul&gt;</description></item><item><title>Thoughts On Cursor</title><link>https://vincent-cheng.com/thoughts-on-cursor/</link><pubDate>Sat, 09 Nov 2024 00:00:00 -0800</pubDate><guid>https://vincent-cheng.com/thoughts-on-cursor/</guid><description>&lt;p&gt;My opinion of whether Cursor helps or hinders my work has fluctuated significantly over the past few months.&lt;/p&gt;
&lt;p&gt;Cursor is really powerful. Having never worked with Selenium drivers, and barely editing a line of code myself, I made a &lt;a href="https://github.com/vncntt/webbot"&gt;pretty nice scraper&lt;/a&gt; in two afternoons. It just works.&lt;/p&gt;
&lt;p&gt;However, there&amp;rsquo;s a significant distinction in how people use AI code editors that I haven&amp;rsquo;t seen explicitly stated anywhere.&lt;/p&gt;
&lt;p&gt;The first way is tabbing to autocomplete code I&amp;rsquo;ve written hundreds of times, giving me more time for higher-order thinking. This is usually when I&amp;rsquo;m working with programming languages and projects I&amp;rsquo;m already familiar with. I don&amp;rsquo;t see any immediate problems with this.&lt;/p&gt;</description></item><item><title>Principle of Least Action</title><link>https://vincent-cheng.com/principle-of-least-action/</link><pubDate>Tue, 05 Nov 2024 00:00:00 -0800</pubDate><guid>https://vincent-cheng.com/principle-of-least-action/</guid><description>&lt;p&gt;I thought &lt;a href="https://www.youtube.com/watch?v=Q10_srZ-pbs"&gt;this video&lt;/a&gt; was really fun and wrote up some of the derivations that the video went over quickly.&lt;/p&gt;
&lt;h1 id="introduction-and-basic-principles"&gt;Introduction and Basic Principles&lt;/h1&gt;
&lt;p&gt;Maupertuis&amp;rsquo; principle of least action states that the action, defined as:&lt;/p&gt;
&lt;p&gt;$$
S_0 = \sum mvs
$$&lt;/p&gt;
&lt;p&gt;where $m$ is the mass, $v$ is the velocity, and $s$ is the distance, reaches a minimum along the actual path of motion.&lt;/p&gt;
&lt;p&gt;Euler later generalized this to a continuous form:&lt;/p&gt;</description></item><item><title>Weird Things in High Dimensions</title><link>https://vincent-cheng.com/weird-/</link><pubDate>Tue, 05 Nov 2024 00:00:00 -0800</pubDate><guid>https://vincent-cheng.com/weird-/</guid><description>&lt;p&gt;Some weird stuff happens in high-dimensions.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://x.com/aryehazan/status/1817877048053911912"&gt;Reference&lt;/a&gt;&lt;/p&gt;
&lt;h2 id="1-high-dimensional-oranges-are-almost-all-peel"&gt;1. &lt;a href="https://x.com/tszzl/status/1817081479190708528"&gt;High Dimensional Oranges Are Almost All Peel&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;Consider an $n$-dimensional cube of side length 1 containing a smaller $n$-dimensional cube with side length $0.8$ (&amp;ldquo;pulp&amp;rdquo;) surrounded by a $0.1$-width border (&amp;ldquo;peel&amp;rdquo;).&lt;/p&gt;
&lt;p&gt;The volume of the pulp is $0.8^n$, which rapidly approaches 0 as $n$ increases:&lt;/p&gt;
&lt;table&gt;
 &lt;thead&gt;
 &lt;tr&gt;
 &lt;th&gt;Dimensions&lt;/th&gt;
 &lt;th&gt;Pulp Volume ($0.8^n$)&lt;/th&gt;
 &lt;/tr&gt;
 &lt;/thead&gt;
 &lt;tbody&gt;
 &lt;tr&gt;
 &lt;td&gt;1&lt;/td&gt;
 &lt;td&gt;0.800&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;2&lt;/td&gt;
 &lt;td&gt;0.640&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;3&lt;/td&gt;
 &lt;td&gt;0.512&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;5&lt;/td&gt;
 &lt;td&gt;0.328&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;10&lt;/td&gt;
 &lt;td&gt;0.107&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;20&lt;/td&gt;
 &lt;td&gt;0.012&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;50&lt;/td&gt;
 &lt;td&gt;0.000014&lt;/td&gt;
 &lt;/tr&gt;
 &lt;/tbody&gt;
&lt;/table&gt;
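&lt;p&gt;The pulp column above is just $0.8^n$ evaluated at each dimension; a two-line Python loop reproduces the table:&lt;/p&gt;

```python
# Fraction of the unit hypercube occupied by the inner "pulp" cube
# of side 0.8, for increasing dimension n. Matches the table above.
for n in (1, 2, 3, 5, 10, 20, 50):
    print(f"n={n:2d}  pulp volume = {0.8 ** n:.6f}")
```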
&lt;p&gt;&lt;a href="https://x.com/Jsevillamol/status/1817213852402303024"&gt;Another perspective&lt;/a&gt;: To randomly sample a point in this cube, we select $n$ independent coordinates from $[0,1]$. The point lies in the pulp only if all coordinates fall within $(0.1, 0.9)$. This probability is $(0.8)^n$, approaching 0 as $n$ increases.&lt;/p&gt;</description></item><item><title>Watch More YouTube</title><link>https://vincent-cheng.com/watch-more-youtube/</link><pubDate>Fri, 20 Sep 2024 00:00:00 -0800</pubDate><guid>https://vincent-cheng.com/watch-more-youtube/</guid><description>&lt;p&gt;Many people have written about curating a very &lt;a href="https://near.blog/how-to-twitter-successfully/"&gt;good&lt;/a&gt; &lt;a href="https://nabeelqu.co/twitter"&gt;Twitter&lt;/a&gt; &lt;a href="https://near.blog/how-to-twitter-successfully/"&gt;feed&lt;/a&gt;, but I have yet to see anyone talk about doing this with Youtube. I don&amp;rsquo;t know if many people do it but don&amp;rsquo;t talk about it, don&amp;rsquo;t do it, or what. I guess &amp;ldquo;subscribe to tons of accounts you enjoy, use the &amp;ldquo;Not Interested&amp;rdquo; for ones you want to avoid, and harvest&amp;rdquo; is not very deep.&lt;/p&gt;
&lt;p&gt;YouTube’s algorithm is very good at recommending content you would enjoy and I’ve more consciously used that to curate my YouTube feed towards content I enjoy (mostly technical). I&amp;rsquo;ve discovered high-quality channels with fewer than 1k subscribers that wouldn&amp;rsquo;t have appeared in my feed if I hadn&amp;rsquo;t been more intentional. As a result, my YouTube feed feels like a mix of a science fair, machine learning conference, math club, hackathon, and symposium.&lt;/p&gt;</description></item><item><title>Podcast Notes</title><link>https://vincent-cheng.com/podcast-notes/</link><pubDate>Sun, 28 Jul 2024 00:00:00 -0800</pubDate><guid>https://vincent-cheng.com/podcast-notes/</guid><description>&lt;h1 id="noam-shazeer-and-jeff-dean-on-dwarkesh"&gt;Noam Shazeer and Jeff Dean on Dwarkesh&lt;/h1&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;arithmetic very cheap. moving data around is expensive.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;model parameters are very memory efficient:&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;one fact per parameter? (this probably isn&amp;rsquo;t the right way to think about it because of superposition?) versus in context, where the KQV activations can carry many more bits&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;inference improvement thing? big model verifier, small model does it first thing?? &amp;ldquo;drafter models&amp;rdquo;. are these real? i don&amp;rsquo;t see how these parallelize. oh wait no you can batch it so it goes drafter -&amp;gt; actual -&amp;gt; drafter -&amp;gt; actual &amp;hellip;&lt;/p&gt;</description></item><item><title>What is this?</title><link>https://vincent-cheng.com/what-is-this/</link><pubDate>Thu, 25 Jul 2024 00:00:00 -0800</pubDate><guid>https://vincent-cheng.com/what-is-this/</guid><description>&lt;p&gt;I want this to be an informal/public notebook where I record thoughts that are too long for a non-premium Twitter account, notes on different things I&amp;rsquo;m reading, and maybe more formal writings as well. The target audience for this page is a mix of myself and current/potential friends.&lt;/p&gt;
&lt;p&gt;Through this notebook, I hope to &amp;ldquo;produce&amp;rdquo; more and write better. For the longest time, I&amp;rsquo;ve been thinking my consuming to producing ratio has been higher than I would like, and hence I&amp;rsquo;m forcing myself to do more frequent, scrappy writeups (also &lt;a href="https://www.swyx.io/learn-in-public"&gt;Learning in Public&lt;/a&gt;). Also, friends have told me about how valuable writing well is yet I&amp;rsquo;ve never actually written much outside of school. Writing more, and in public, will hopefully speedrun me becoming a better writer (please give me feedback if you have any!).&lt;/p&gt;</description></item></channel></rss>