currently the formatting is a bit broken since i need to make it respect single newlines and i also need to make it latex the dollar signs. mb gang.

caps lock

9/10/25 I noticed caps lock is asymmetric on my computer (linux): caps lock activates on key down, and deactivates on key up.

more clearly: if caps lock is off, and I hold it and press a letter, caps lock will already be on and the letter will be capitalized. but if caps lock is already on, and I hold it and press a letter, caps lock will stay on (it only turns off when I release the key) and the letter will be capitalized. this is odd. I'd expect both halves of the toggle to happen on the same event - either key up or key down. so why don't they?

apparently this has been defended by linux devs as "historically correct". see this remarkably unproductive discussion on the manjaro forums. I also found a number of linux users complaining about this, since some people tap caps lock twice when they want to capitalize a letter, instead of using shift. In those cases this behavior makes them more likely to also capitalize the second letter of the word. here's one example from the debian forums.

afaik windows and mac don't have this behavior, but I haven't checked. I'm not a fan of caps lock (is it too obvious?) and I prefer to remap it with autohotkey on windows. I would remap it on linux too, but only X lets you do fun keyboard remaps and automation easily; the wayland solutions all seem not great and not worth the effort and security risk.

I also found this odd post saying that some people have to press shift to turn off caps lock. in other keyboard news, there was this lovely HN post "The Day Return Became Enter" of a 2023 article. keyboards are so cool. I get more into computer history every day. actually, I was just reading about how linux's pseudo-terminal system works and there's some historical quirks there too that I should write about since this thing is now for ramblings. but then again, I'm the audience for now and it's notes to me so if you're not me and reading this, you'll just have to deal with my lack of effort/polish :)

ramblings

9/10/25 today I'm giving this an update. It's no longer going to be just math, instead it's going to be a reverse chronological dump of interesting things I've encountered. Mostly math and CS for now, I'm sure, but hopefully plenty of randomness and fun stuff. maybe some linkposts too.

why RH gives bounds

9/8/25 so I got curious and tried to look into the proportion of squarefree numbers. apparently it's yet another thing linked to the Riemann hypothesis, namely in the fact that the count is $\frac{6}{\pi^{2}}N$ plus an error term you can easily bound by $\sqrt{N}$, but assuming RH you get a bound of $N^{1/4+\varepsilon}$. the unconditional best known bound is $N^{\frac{11}{35}}$ I think? average number theory result, whatever, at least there's no $\log \log \log n$ in there.
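the main term is easy to poke at numerically. a quick sketch (plain python, a sieve over multiples of squares; the cutoff $N=10^6$ is arbitrary) comparing $Q(N)$ to $\frac{6}{\pi^{2}}N$:

```python
import math

def squarefree_count(N):
    """Count squarefree numbers up to N by sieving out multiples of d^2."""
    squarefree = [True] * (N + 1)
    for d in range(2, math.isqrt(N) + 1):
        for m in range(d * d, N + 1, d * d):
            squarefree[m] = False
    return sum(squarefree[1:])

N = 10**6
Q = squarefree_count(N)
main_term = 6 * N / math.pi**2
# the observed error is tiny compared to the trivial sqrt(N) bound
print(Q, main_term, abs(Q - main_term) / math.sqrt(N))
```

in practice the error at this scale is single digits, far below $\sqrt{N}=1000$, consistent with the much smaller conjectured exponent.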

anyway, it's like "why? why is RH sticking its nose in everybody's business?" so here's one answer:

look at the function $Q(N) = \sum_{n\leq N} \mu^2(n)$, which is the number of squarefree numbers up to $N$

it gives the Dirichlet series $F(s) = \sum_{n=1}^\infty \frac{\mu^2(n)}{n^s}$, which turns out to be $\frac{\zeta(s)}{\zeta(2s)}$, related to the fact that the Dirichlet series for $\mu(n)$ is $\frac{1}{\zeta(s)}$.
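the identity can be spot-checked at $s=2$, where $\frac{\zeta(2)}{\zeta(4)} = \frac{\pi^{2}/6}{\pi^{4}/90} = \frac{15}{\pi^{2}}$; a sketch (the cutoff 100000 is arbitrary, just enough for the tail to be negligible):

```python
import math

def is_squarefree(n):
    # trial division: n is squarefree iff no d^2 divides it
    d = 2
    while d * d <= n:
        if n % (d * d) == 0:
            return False
        d += 1
    return True

# partial sum of F(2) = sum mu^2(n) / n^2, which should approach zeta(2)/zeta(4)
partial = sum(1 / n**2 for n in range(1, 100000) if is_squarefree(n))
print(partial, 15 / math.pi**2)  # both ~1.5198
```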

then, to get asymptotics for partial sums of the coefficients of $F(s)$, we use Perron’s formula: $$ Q(N) = \frac{1}{2\pi i}\int_{c-i\infty}^{c+i\infty} F(s)\frac{N^{s}}{s}\,ds. $$ this is a bit magic and I didn't look into how it works because it is 2:30 in the morning. then the idea is to start with $c > 1$, where the series converges, and shift the contour left into the critical strip ($0 < \Re(s) < 1$), trying to capture residues of poles (like the pole of $\zeta(s)$ at $s=1$, whose residue gives the main term $\frac{N}{\zeta(2)} = \frac{6}{\pi^{2}}N$) while keeping control of what happens along the new contour.

The error term depends on how far you can push the contour left, and how well you can bound $F(s)$ on that new line. since $F(s) = \zeta(s)/\zeta(2s)$ has poles at zeros of $\zeta(2s)$, the zeta zeroes are our obstruction (every zero of $\zeta$ spawns a singularity of $F$ halfway over).

without RH we only know crude zero-free regions (e.g. de la Vallée Poussin’s region $\Re(s) > 1 - c/\log |t|$), which somehow lets you push the contour just a little left of 1, but no further without risk of crossing zeros. with RH you know all zeros lie on $\Re(s)=1/2$. Then you can push the contour all the way to $\Re(s)=1/2$ safely. then I suppose the natural size of the integral when evaluated along $\Re(s)=1/2$ works out to $N^{1/4+\varepsilon}$.

one can ask if this can be improved. RH only pins zeros to the critical line, not how they are distributed along it, so in principle they could cluster / have big gaps. this matters because in the Perron formula integral, the oscillations you get in the error term come from contributions of terms like $N^{\rho/2}$ where $\rho$ is a zeta zero on the critical line. so if the zeros "conspire" somehow they can add spikes in the error, but if they're regular you get lower error.

apparently we don't know too much on zeta zero spacing statistics? I saw one link on Montgomery’s pair correlation conjecture and another on random matrix theory heuristics, but didn't look into either of them yet. my impression is that people think $N^{1/4}$ is the natural barrier (under RH).

mobius function = euler characteristic

I asked chatgpt why the mobius function appears everywhere instead of, for example, the liouville function. the main answer seems to be the inversion formula: one should look at it not as a function but as a convolution kernel that inverts the sum over divisors (which seems somehow like a number theoretic integral?). in this vein, one can recursively define the mobius function of a(n interval of a locally finite) poset, and you get an inversion theorem there too. a nice result here is that if you look at a finite dim simplicial complex as a poset of chains (this is by inclusion right?) then the reduced euler characteristic = alternating sum of reduced betti numbers is gonna be the same as its mobius function (when you look at the whole thing as an interval)
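the poset mobius function is easy to compute from its recursion, and for the divisor poset the interval $[1,n]$ recovers the number-theoretic $\mu(n)$. a sketch checking that (helper names are mine; the classical $\mu$ is computed straight from the factorization):

```python
from functools import lru_cache

def divisors(n):
    return [d for d in range(1, n + 1) if n % d == 0]

def mu_poset(n):
    """Mobius function of the interval [1, n] in the divisor poset, via
    mu(x, x) = 1 and mu(x, y) = -sum of mu(x, z) over x <= z < y."""
    @lru_cache(maxsize=None)
    def mu(d):  # mu(1, d)
        if d == 1:
            return 1
        return -sum(mu(e) for e in divisors(d) if e != d)
    return mu(n)

def mu_classical(n):
    """Number-theoretic mobius: 0 if a square divides n, else (-1)^(#prime factors)."""
    count, m, p = 0, n, 2
    while p * p <= m:
        if m % p == 0:
            m //= p
            if m % p == 0:
                return 0  # p^2 divides n
            count += 1
        else:
            p += 1
    if m > 1:
        count += 1
    return (-1) ** count

assert all(mu_poset(n) == mu_classical(n) for n in range(1, 60))
```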

Merkle–Hellman knapsack cryptosystem

I was linked to the partially unredacted NSA document "50 years of mathematical cryptography" from bruce schneier's blog. in the appendix, the author described a public key cryptosystem based on a superincreasing set, where each integer of the set (in order) is greater than the sum of all previous ones. I just looked it up and apparently it's insecure, but I'll describe it briefly. the hard problem: given a set and a desired sum, find a subset which sums to the desired sum. clearly easy for a superincreasing set, NP-hard in general. the idea is that the superincreasing set (the private key) can be transformed back and forth (the trapdoor) to a non-superincreasing set (the public key). suppose you want to encrypt n bits: pick a RANDOM superincreasing set W of size n, a random q bigger than the sum of all of them, a random r coprime to q, and the public key B is $b_{i}=rw_{i} \bmod q$. to decrypt a ciphertext, which is a sum of some of the $b_{i}$, we just multiply it by $r^{-1} \bmod q$ and now we have a sum of some of the $w_{i}$, which is easy to solve.

although it's broken, it's interesting to note that this cryptosystem is a kind of commutative square. namely, the transformation from private to public key is the same as from private to public ciphertext, given by multiplying by $r$. in the other direction of the arrows, it's summing / solving the sum problem. I wonder if the other cryptosystems I know, RSA and diffie-hellman, are also somehow commutative. DH is symmetric though, so maybe not that. what about RSA?
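the scheme is small enough to sketch in full (parameter sizes and names are mine, and of course this is the broken toy cryptosystem, not something to use):

```python
import math
import random

def keygen(n, rng):
    # private key: superincreasing set, each element bigger than the sum so far
    w, total = [], 0
    for _ in range(n):
        x = total + rng.randint(1, 100)
        w.append(x)
        total += x
    q = total + rng.randint(1, 100)   # modulus bigger than the sum of everything
    r = rng.randrange(2, q)
    while math.gcd(r, q) != 1:        # r must be invertible mod q
        r = rng.randrange(2, q)
    b = [(r * wi) % q for wi in w]    # public key: the scrambled, non-superincreasing set
    return b, (w, q, r)

def encrypt(bits, b):
    # ciphertext is just the subset sum selected by the plaintext bits
    return sum(bi for bit, bi in zip(bits, b) if bit)

def decrypt(c, private):
    w, q, r = private
    s = (c * pow(r, -1, q)) % q       # undo the trapdoor: now a sum of some w_i
    bits = []
    for wi in reversed(w):            # greedy works because w is superincreasing
        if s >= wi:
            bits.append(1)
            s -= wi
        else:
            bits.append(0)
    return list(reversed(bits))

rng = random.Random(0)
pub, priv = keygen(8, rng)
msg = [1, 0, 1, 1, 0, 0, 1, 0]
assert decrypt(encrypt(msg, pub), priv) == msg
```

decryption works because $\sum b_i \equiv r \sum w_i \pmod q$ and $\sum w_i < q$, so multiplying by $r^{-1}$ recovers the subset sum exactly.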

spaces and geometries

convergence in measure of CDFs

notice for random variables we have two notions of convergence which treat them like measurable functions - convergence ae, and convergence in probability ($\mathbb{P}(|X_{n}-X|>\varepsilon)\to 0$ for all $\varepsilon >0$) - and one notion of convergence which doesn't care about their base space: convergence in distribution, which is pointwise convergence of the CDFs at all continuity points.

  1. (ANS: NO) since the CDF is a measurable function, we can ask if convergence ae and convergence in measure (locally or globally) are meaningful for CDFs.
    • best approach here is probably to start by constructing examples that converge/don't converge in this sense while converging/not converging in the standard senses
  2. (NOT POSSIBLE) we can ask if convergence in probability can be expressed in terms of CDFs, so that it becomes independent of the base space.
    • one approach might be to turn the r.v. into a r.v. with base space $[0,1]$ using the inverse of the CDF, and ask if convergence in probability of these new equivalent r.v.s is the same as convergence in probability of the original r.v.s
  3. (MOOT) if we are successful on the above points and convergence properties of the original r.v.s translate to convergence properties of the CDFs, then it might be plausible to translate convergence properties "backwards" as well, in some way, so that convergence ae for CDF => some weird kind of convergence for r.v.s is analogous to the implication convergence ae for r.v.s => the same weird kind of convergence for a new and hopefully interesting type of object
    • this is sketchier and should be attempted last
  4. see below. there's also a few notions of convergence of measures https://en.wikipedia.org/wiki/Convergence_of_measures, which notes that convergence in distribution is weak convergence of the pushforward measure. it would be good to devise as many notions of convergence as you can think of and figure out how they relate and why certain ones are more useful than others.

okay, I figured out (2), I guess. convergence in distribution is basically ae convergence of the MONOTONIZATIONS of the random variables. by this I mean: consider the rv with the same law as $X_n$ given by $F_{n}^{-1}:[0,1] \to \mathbb{R}$; this is basically (up to some unimportant things like the shape of the domain and continuity points) the unique monotonic function with this law, and $F_{n}$ converging at all continuity points means the same thing as the monotonization converging at all continuity points. but it should be clear that this completely destroys ALL the information present in the events of $X$, in the sense that we are treating every event $\omega$ in the probability space as exactly the same and interchangeable, which is generally a big loss of information. in other words, the sorting process which allows us to monotonize DEPENDS ON $n$ in a fundamentally unpredictable way. we can look at the monotonized variables as carrying the data of two maps: the first being the monotonization map $M_{n}:\Omega \to [0,1]$ and the second being the random variable $F_{n}^{-1}:[0,1]\to \mathbb{R}$. convergence in distribution completely determines the behavior of the second map, and says exactly nothing about the first map; we are free to choose it at will. however, convergence ae and convergence in probability make strong claims about this first map; they imply that it converges, since they are convergence conditions on $F^{-1} \circ M$.
I think the interpretation that each individual event will either settle down to a nice place in $[0,1]$ almost surely or in probability should still hold. As an example, consider the CLT for coin flips with values -1 and 1: each individual random walk never settles down, with its limsup and liminf diverging, so $M_{n}$ will almost surely map it to every point of $[0,1]$ infinitely often, yet the second map settles down. Conversely, with the LIL, since every event will have limsup 1 and liminf -1, $M_{n}$ should still map every single event to every point of $[0,1]$ almost surely. However, since the distribution concentrates all mass on a single point (zero, since the distribution is the CLT normal divided by $\sqrt{2\log \log n}$), $M_{n}$ has to map all the probability mass of $\Omega$ to zero, so it converges in probability as well. This is unusual; if the distribution had concentrated all the mass at more than one point, $M_{n}$ would have two choices in $[0,1]$ on where to map all the probability mass, and could switch it around all the time and not have to converge in probability. it's like some sort of forwards vs backwards probability mass: the preimage of each of the points might have mass 0.5, but going forwards there's no way to pick some set of events of half the mass which go to that point in any reasonable way.

this breakdown $F^{-1}\circ M$ also sheds some light on the other questions. If we want $F$ to converge in other weird senses, these give conditions on $F^{-1}$ but not on $M$, and so we can say nothing about convergence of the random variables in probability, in $L^{p}$, or almost everywhere, as these require $M$ to not be bad. so (3) is moot and for (1) the answer is that weaker notions of convergence of $F$ are probably uninteresting. Actually, since each $F_{n}$ is monotonic these might all end up being the same? try to check this.

As a new question, we can ask (5): what do convergence conditions on just $M$ look like? can we make a table of convergence conditions on $M$ vs convergence conditions on $F$ and see which of the various convergence conditions on $F^{-1} \circ M$ we might get? are any of the normal convergence conditions not possible to split into separate conditions on $M$ and $F$?
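the CLT example above can be poked at numerically: individual scaled walks $S_n/\sqrt{n}$ don't settle down, but the monotonization $F_{n}^{-1}$ (concretely, the empirical quantile function) does. a sketch with plain python (helper names and the sample sizes are mine), comparing empirical quantiles against $\Phi^{-1}$ via statistics.NormalDist:

```python
import random
import statistics

def scaled_walk_samples(n, m, rng):
    """m independent copies of S_n / sqrt(n) for a +/-1 coin-flip walk."""
    out = []
    for _ in range(m):
        s = sum(1 if rng.random() < 0.5 else -1 for _ in range(n))
        out.append(s / n ** 0.5)
    return out

def empirical_quantile(samples, p):
    # sorting the samples is exactly the monotonization F_n^{-1} on a grid
    s = sorted(samples)
    return s[int(p * len(s))]

rng = random.Random(1)
samples = scaled_walk_samples(400, 5000, rng)
nd = statistics.NormalDist()
for p in (0.25, 0.5, 0.75):
    # empirical quantiles should sit near the standard normal quantiles
    print(p, round(empirical_quantile(samples, p), 2), round(nd.inv_cdf(p), 2))
```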

incomparable turing degrees

it seems like a non-forcing approach to incomparable turing degrees would be some sort of counting thing:

categoricity of Th(N)

so PA has lots of countable nonstandard models, but maybe we can get rid of that by adding some axioms? (lowenheim-skolem says we have to keep some nonstandard models, so let's only try to kill the countable ones.) We can start by adding all the axioms, looking at the complete theory of true arithmetic Th(N). One would really hope this is $\aleph_{0}$-categorical, ie there's no countable model elementarily equivalent to but not isomorphic to $\mathbb{N}$. but looking at wikipedia - this is false!! there are $2^{\kappa}$ many nonisomorphic models for each infinite cardinal $\kappa$! how cooked is that?? very sad. this is apparently what it means to be an "unstable theory".

wang tiles

question: what is the arithmetic complexity of the set of valid sets of wang tiles?

two funky things in computability theory

pareto principle

the pareto principle says "80% of the gains come from 20% of the effort". my understanding is that the pareto distribution has PDF proportional to $t^{k}$ when $t \ge t_{0}$ and $0$ for $t < t_{0}$, with parameters $k<-1$ and $t_{0} > 0$. here 20% of the effort (time) would be infinite, so I would say the claim to hope for here is that on any interval $[a,b]$ inside the support, 80% of the mass of the PDF in the interval is contained in the first 20% of that interval. is that true for all $k$? any $k$? let's see. $\int_{a}^{b} t^{k}\ \mathrm{d}t \propto b^{k+1} - a^{k+1}$, so writing $u=k+1$ for convenience, and noting $0.2b+0.8a$ is the point 20% of the way from $a$ to $b$, we want $0.8(b^{u}-a^{u})=(0.2b+0.8a)^{u}-a^{u}$, ie $(0.2b+0.8a)^{u}=0.8b^{u}+0.2a^{u}$. via desmos, for $u=-1$ we get a line at $b=16a$ in addition to $b=a$. so not quite everything. can we maybe set up a differential equation for this? don't need a diffeq actually, it's a functional equation: $0.8F(b)-0.8F(a)=F(0.2b+0.8a)-F(a)$ where $F$ is the antiderivative of the PDF. what sorts of functions satisfy this? wikipedia has the $u$ that works at https://en.wikipedia.org/wiki/Pareto_distribution#Relation_to_the_%22Pareto_principle%22
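a quick numeric check of the functional equation at the $u=-1$, $b=16a$ root seen in desmos (function name is mine):

```python
def residual(u, a, b):
    # (0.2b + 0.8a)^u - (0.8 b^u + 0.2 a^u): zero exactly when the 80/20 split holds
    return (0.2 * b + 0.8 * a) ** u - (0.8 * b ** u + 0.2 * a ** u)

print(residual(-1, 1.0, 16.0))  # essentially 0: b = 16a solves the u = -1 case
print(residual(-1, 1.0, 8.0))   # clearly nonzero: other ratios fail
```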

a youtube comment

For a more intuitive sense of where the E8 lattice comes from, there is an exceptional series of polytopes discovered by Thorold Gosset over 100 years ago, known as the k_21 semiregular figures (but I like to call them kaleidoplexes) that leads up to it. These objects can be understood as being built from simplexes and orthoplexes, arranged around relevant facets in 1:2 ratio.

Meanwhile, the circumradius relative to edge length has been steadily increasing from √21/6 (~0.764) for the prism, through √15/5, √10/4, √6/3, √3/2, (notice the triangular numbers) and finally to 1 for the 8-kaleidoplex. Additionally, the 2 types of internal angles between sides have also been increasing. This is where the E8 lattice finally appears. Rather than there being a proper 9-kaleidoplex, the 5_21 is actually a tesselation of 8D space whose vertex figure is the 4_21. It is still built from 8-simplexes and 8-orthoplexes in a 1:2 ratio around each 6-simplex facet, but the 2 angle types are now 180 degrees, hence why it can't "fold up" into a convex object in 9D space. While most people are aware of the Platonic solids, these E-type objects––along with the F-type 24-cell in 4D, the G-type hexagon in 2D, the non-crystallographic H-type pentagonal polytopes in just 2, 3, and 4 dimensions, and the 4 infinite families of A-type simplexes, B-type measures (the hypercubes), C-type orthoplexes (technically also B, and not properly C until considering root systems), and D-type demi-hypercubes (which I prefer to call alterplexes)––fill out a larger classification structure for the possible types of symmetries, one which is directly related to the classification of finite simple groups, too.

huh? try to understand at some point. also see https://en.wikipedia.org/wiki/Exceptional_object#/media/File:Exceptionalmindmap2.png
these are comments from aleph 0's video on the E8 sphere packing

wang tiles in computable hierarchy

first code all the sets of wang tiles as natural numbers. that is, there is some computable function $\text{tiles}:\mathbb{N}\to \text{sets of Wang tiles}$ for which $\text{tiles}(n)$ is a unique set of Wang tiles up to isomorphism.

then, there is some subset $A \subseteq \mathbb{N}$ corresponding to wang tiles which can tile the plane. what is the formula for $A$? what is its complexity?

$n\in A$ means:
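one way to make the quantifier structure concrete: by a compactness argument (könig's lemma), a tile set tiles the whole plane iff it tiles every finite $k\times k$ square, and each square check is computable, so membership looks like a universal statement over computable facts ($\Pi^0_1$-shaped). a brute-force sketch of the finite check (the tile encoding as (top, right, bottom, left) color tuples is mine):

```python
def tiles_square(tiles, k):
    """Backtracking search: can this tile set tile a k x k square?

    tiles: list of (top, right, bottom, left) color tuples. Adjacent tiles
    must agree on their shared edge; tiles may be reused, not rotated.
    """
    grid = [[None] * k for _ in range(k)]

    def place(i, j):
        if i == k:                                  # filled every row
            return True
        ni, nj = (i, j + 1) if j + 1 < k else (i + 1, 0)
        for t in tiles:
            top, right, bottom, left = t
            if i > 0 and grid[i - 1][j][2] != top:   # match bottom of tile above
                continue
            if j > 0 and grid[i][j - 1][1] != left:  # match right of tile to the left
                continue
            grid[i][j] = t
            if place(ni, nj):
                return True
            grid[i][j] = None                        # backtrack
        return False

    return place(0, 0)
```

so "tiles(n) tiles the plane" unfolds as "for all k, tiles_square(tiles(n), k)" with each conjunct decidable, while failure to tile is witnessed by some finite k.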

measure zero vs meagre

https://mathoverflow.net/questions/43478/is-there-a-measure-zero-set-which-isnt-meagre

Justin's rep theory talk

Aren's rotation number talk

talk 4/3

fractional quantum hall effect

metrization theorems

some notes on electricity and pauli's exclusion principle

prompted by this blogpost from HN: https://lcamtuf.substack.com/p/but-good-sir-what-is-electricity
how electrons flow in a wire is certainly something to investigate - our electrical and plumbing systems are to me the two pillars of modern amenities. i've phrased that badly, but in any case, electrons in metals. let's do it.

why are standard borel spaces determined by their cardinality?

don't forget to read the CT thing in the Baez link

scissors congruence sum talk

lizzie's research

symbolic dynamics pitch from albert at pcmi

  1. topology, metric space: SFT
  2. space of bi-infinite path in some groups
  3. incidence matrix
  4. information theory invariant - entropy

pcmi talk from scott aaronson

link from hopf algebras to knot theory

  1. start with a Hopf algebra, for instance a quantum group
    • a Hopf algebra is a (unital associative) algebra and compatibly a (counital coassociative) coalgebra with an antipode map
    • you should think of comultiplication as like a generalized version of the diagonal map? on the algebras themselves it's like a tensor product
    • the antipode map from the Hopf algebra to itself is an antihomomorphism (reverses the order of multiplication) such that the hexagon diagram commutes
    • the antipode map determines its bialgebra up to isomorphism, so being a Hopf algebra is a property of a bialgebra, namely admitting an antipode, not really additional structure
    • example: group algebra with comultiplication is diagonal map / tensor product, counit is $g \mapsto 1$, antipode is inverse map
    • more examples: functions $K^G$ from finite group to field, representative functions on compact group, regular function on algebraic group, tensor algebra $T(V)$, universal enveloping algebra $U(\mathfrak{g})$, sweedler's hopf algebra $H=K[c,x]/(c^{2}=1,x^{2}=0,xc=-cx)$.
    • these examples are either commutative or cocommutative (?), but we can "deform" or "quantize" these to get a special type of hopf algebra called a quantum group. no agreed-on formal definition yet, but the idea is that a standard algebraic group is well described by its standard hopf algebra of regular functions, while the deformed hopf algebra describes non-standard / quantized algebraic group (so not an algebraic group). instead of trying to manipulate these weird group-like things, work with their hopf algebras instead.
  2. category of representations = ribbon category (morphisms are isotopy classes of framed links/tangles) = braided monoidal category (Vect-enriched)

  3. braiding is matrix
    • $R:V \otimes W \to W \otimes V$ satisfies some axioms
  4. add trace at each pt

  5. knot invariant
    • put the picture of gru (step 5: profit??)
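the group algebra example above can be sanity-checked on basis elements: on a group-like $g$ we have $\Delta(g)=g\otimes g$, so the antipode axiom $m(S\otimes \mathrm{id})\Delta = \eta\varepsilon$ reduces to $g^{-1}g = e$, and extends to the whole algebra by linearity. a trivial sketch with $G=\mathbb{Z}/6$ (my choice of toy group, written additively):

```python
n = 6  # the cyclic group Z/6

def delta(g):
    # comultiplication on a basis (group-like) element: Delta(g) = g (x) g
    return (g, g)

def antipode(g):
    # S(g) = g^{-1}, which is negation in additive notation
    return (-g) % n

def counit(g):
    # epsilon(g) = 1 for every group element
    return 1

for g in range(n):
    left, right = delta(g)
    # m(S (x) id)(Delta(g)) should land on the identity element, matching eta(eps(g))
    assert (antipode(left) + right) % n == 0
    assert counit(g) == 1
print("antipode axiom holds on all basis elements of k[Z/6]")
```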

Q: What does the set of "maximally measurable" functions look like?

LEDs


moduli spaces

definition by example: think about the grassmannian. we have an object and a bundle over that object (each point of Gr represents a subspace, so the bundle is the disjoint union of all subspaces). a moduli space M is the base space for a universal family U. definition 1


some finite group theory

derived series = the subgroup series made out of taking commutators over and over; $G_{i+1}= [G_{i},G_{i}]$, $G_{0}=G$.
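a sketch computing the derived series for a small permutation group (helper names are mine; permutations as tuples acting on indices). for $S_3$ this should walk $S_3 \supset A_3 \supset \{e\}$:

```python
def compose(p, q):
    # (p . q)(i) = p[q[i]]
    return tuple(p[q[i]] for i in range(len(p)))

def inverse(p):
    inv = [0] * len(p)
    for i, v in enumerate(p):
        inv[v] = i
    return tuple(inv)

def closure(gens):
    # subgroup generated by gens: multiply until nothing new appears
    identity = tuple(range(len(next(iter(gens)))))
    group, frontier = {identity}, set(gens)
    while frontier:
        group |= frontier
        frontier = {compose(a, b) for a in group for b in group} - group
    return group

def commutator_subgroup(G):
    # [G, G] = subgroup generated by all a b a^{-1} b^{-1}
    comms = {compose(compose(a, b), compose(inverse(a), inverse(b)))
             for a in G for b in G}
    return closure(comms)

S3 = closure({(1, 0, 2), (0, 2, 1)})   # generated by two transpositions
series = [S3]
while len(series[-1]) > 1:
    series.append(commutator_subgroup(series[-1]))
print([len(g) for g in series])  # prints [6, 3, 1]
```

the orders 6, 3, 1 are exactly $S_3$, $A_3$, trivial, so $S_3$ is solvable, as expected.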

need to read proof of jordan holder

some properties of groups

more on perfect groups

other types of subgroup series


some probability theory

stopping times