currently the formatting is a bit broken since i need to make it respect single newlines and i also need to make it latex the dollar signs. mb gang.
caps lock
9/10/25
I noticed caps lock is asymmetric on my computer (linux): caps lock activates on key down, and deactivates on key up.
more clearly, if caps lock is off, and I hold it and press a letter, caps lock will be on already and the letter will be capitalized. but if caps lock is already on, and I hold it and press a letter, caps lock will stay on and the letter will be capitalized. this is odd. I'd expect both halves of the toggle to be the same - either key up or key down. so why isn't it?
apparently this has been defended by linux devs as "historically correct". see this remarkably unproductive discussion on the manjaro forums. I also found a number of linux users complaining about this, since some people tap caps lock twice when they want to capitalize a letter, instead of using shift. In those cases this behavior makes them more likely to also capitalize the second letter of the word. here's one example from the debian forums.
afaik windows and mac don't have this behavior, but I haven't checked. I'm not a fan of caps lock (is it too obvious?) and I prefer to remap it with autohotkey in windows. I would remap it on linux too, but only X lets you do fun keyboard remaps and automation easily; the wayland solutions all seem not great and not worth both the effort and security risk.
I also found this odd post saying that some people have to press shift to turn off caps lock. in other keyboard news, there was this lovely HN post "The Day Return Became Enter" of a 2023 article. keyboards are so cool. I get more into computer history every day. actually, I was just reading about how linux's pseudo-terminal system works and there's some historical quirks there too that I should write about since this thing is now for ramblings. but then again, I'm the audience for now and it's notes to me so if you're not me and reading this, you'll just have to deal with my lack of effort/polish :)
ramblings
9/10/25
today I'm giving this an update. It's no longer going to be just math, instead it's going to be a reverse chronological dump of interesting things I've encountered. Mostly math and CS for now, I'm sure, but hopefully plenty of randomness and fun stuff. maybe some linkposts too.
why RH gives bounds
9/8/25
so I got curious and tried to look into the proportion of squarefree numbers. apparently it's yet another thing linked to the Riemann hypothesis: the count is $\frac{6}{\pi^2}N$ plus an error you can easily bound by $\sqrt{N}$, but assuming RH the error should drop to $N^{\frac{1}{4}+\varepsilon}$. the best proven bound under RH is $N^{\frac{11}{35}}$ I think? average number theory result, whatever, at least there's no $\log \log \log n$ in there.
anyway, it's like "why? why is RH sticking its nose in everybody's business?"
so here's one answer:
look at the function $Q(N) = \sum_{n\leq N} \mu^2(n)$, which is the number of squarefree numbers up to $N$
it gives the Dirichlet series $F(s) = \sum_{n=1}^\infty \frac{\mu^2(n)}{n^s}$, which turns out to be $\frac{\zeta(s)}{\zeta(2s)}$, related to the fact that the Dirichlet series for $\mu(n)$ is $\frac{1}{\zeta(s)}$.
then, to get asymptotics for partial sums of coefficients of $F(s)$, we use Perron’s formula:
$$
Q(N) = \frac{1}{2\pi i}\int_{c-i\infty}^{c+i\infty} F(s)\frac{N^s}{s}\,\mathrm{d}s.
$$
this is a bit magic and I didn't look into how it works because it is 2:30 in the morning.
then the idea is to start with $c > 1$, where the series converges, and shift the contour left into the critical strip ($0 < \Re(s) < 1$), trying to capture residues of poles (like the pole of $\zeta(s)$ at $s=1$) while keeping control of what happens along the new contour.
The error term depends on how far you can push the contour left, and how well you can bound $F(s)$ on that new line. since $F(s) = \zeta(s)/\zeta(2s)$ has poles at zeros of $\zeta(2s)$, the zeta zeroes are our obstruction (every zero of $\zeta$ spawns a singularity of $F$ halfway over).
without RH we only know crude zero-free regions (e.g. de la Vallée Poussin’s region $\Re(s) > 1 - c/\log |t|$), which somehow lets you push the contour just a little left of 1, but no further without risk of crossing zeros. with RH you know all zeros lie on $\Re(s)=1/2$. Then you can push the contour all the way to $\Re(s)=1/2$ safely. then I suppose the natural size of the integral when evaluated along $\Re(s)=1/2$ works out to $N^{1/4+\varepsilon}$.
one can ask if this can be improved. RH only pins zeros to the critical line, not how they are distributed there, so in principle they could cluster / have big gaps. this matters because in the Perron formula integral, the oscillations you get in the error term come from contributions of terms like $N^{\rho/2}$ where $\rho$ is a zeta zero on the critical line. so if the zeroes "conspire" somehow they can add spikes in the error, but if they're regular you get lower error.
apparently we don't know too much on zeta zero spacing statistics? I saw one link on Montgomery’s pair correlation conjecture and another on random matrix theory heuristics, but didn't look into either of them yet. my impression is that people think $N^{1/4}$ is the natural barrier (under RH).
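as a sanity check, here's a minimal python sieve for $Q(N)$ against the main term $\frac{6}{\pi^{2}}N$ (the error column is the thing all the contour-pushing above is trying to control):

```python
import math

def count_squarefree(N):
    # sieve: cross out every multiple of d^2; whatever survives is squarefree
    squarefree = [True] * (N + 1)
    for d in range(2, math.isqrt(N) + 1):
        for m in range(d * d, N + 1, d * d):
            squarefree[m] = False
    return sum(squarefree[1:])

for N in [10**3, 10**4, 10**5, 10**6]:
    Q = count_squarefree(N)
    print(N, Q, Q - 6 * N / math.pi**2)  # error stays well below sqrt(N)
```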
mobius function = euler characteristic
I asked chatgpt why mobius function appears everywhere instead of, for example, the liouville function
the main answer seems to be the inversion formula; one should look at it not as a function but as a convolution kernel that inverts the sum over divisors (which seems somehow like a number theoretic integral?)
in this vein, one can recursively define the mobius function of a(n interval of a locally finite) poset, and you get an inversion theorem there too
a nice result here then is that if you look at a finite dim simplicial complex as a poset of chains (this is by inclusion right?) then the reduced euler characteristic = alternating sum of reduced homology is gonna be the same as its mobius function (when you look at the whole thing as an interval)
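a minimal sketch of the poset version, for the divisor poset (my choice of example), checking that the interval $[1,n]$ recovers the usual mobius function; the recursion is $\mu(a,a)=1$ and the values on $[a,b]$ summing to zero:

```python
from functools import lru_cache

def divisors(n):
    return [d for d in range(1, n + 1) if n % d == 0]

@lru_cache(maxsize=None)
def mu_poset(a, b):
    # mobius function of the interval [a, b] in the divisor poset
    if a == b:
        return 1
    return -sum(mu_poset(a, c) for c in divisors(b) if c % a == 0 and c != b)

print([mu_poset(1, n) for n in range(1, 13)])
# [1, -1, -1, 0, -1, 1, -1, 0, 0, 1, -1, 0] -- the usual mu(n)
```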
Merkle–Hellman knapsack cryptosystem
I was linked to the partially unredacted NSA document "50 years of mathematical cryptography" from bruce schneier's blog. in the appendix, the author described a public key cryptosystem based on a superincreasing set, where each integer of the set (in order) is greater than the sum of all previous ones.
I just looked it up and apparently it's insecure, but I'll describe it briefly: the hard problem is that given a set and a desired sum, find a subset which sums to the desired sum. clearly easy for a superincreasing set, NP-hard for a random set. the idea is that the superincreasing set (the private key) can be transformed back and forth (the trapdoor) to a non-superincreasing set (the public key).
suppose you want to encrypt n bits; pick a RANDOM superincreasing set W of size n, random q bigger than the sum of all of them, random r coprime to q, and the public key B is $b_{i}=rw_{i}\mod q$.
to decrypt a ciphertext which is a sum of some of the $b_{i}$, we just multiply it by $r^{-1}\mod q$ and now we have a sum of some of the $w_{i}$, which is easy.
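here's a toy python sketch of the whole scheme (insecure, obviously, and the parameter ranges are just my illustrative choices):

```python
import math, random

def keygen(n):
    # private key: superincreasing set w, modulus q > sum(w), r coprime to q
    w, total = [], 0
    for _ in range(n):
        x = random.randint(total + 1, 2 * total + 2)  # > sum of all previous
        w.append(x)
        total += x
    q = random.randint(total + 1, 2 * total)
    r = random.randrange(2, q)
    while math.gcd(r, q) != 1:
        r = random.randrange(2, q)
    b = [(r * wi) % q for wi in w]  # public key: disguised, not superincreasing
    return (w, q, r), b

def encrypt(bits, b):
    return sum(bi for bit, bi in zip(bits, b) if bit)

def decrypt(c, private):
    w, q, r = private
    s = (c * pow(r, -1, q)) % q  # undo the trapdoor: now a superincreasing sum
    bits = []
    for wi in reversed(w):  # greedy works exactly because w is superincreasing
        bits.append(1 if s >= wi else 0)
        if bits[-1]:
            s -= wi
    return bits[::-1]

private, public = keygen(8)
msg = [1, 0, 1, 1, 0, 0, 1, 0]
assert decrypt(encrypt(msg, public), private) == msg
```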
although it's solved, it's interesting to note that this cryptosystem is a kind of commutative square. namely, the transformation from private to public key is the same as from private to public ciphertext, given by multiplying by $r$. in the other direction of the arrows, it's summing / solving the sum problem.
I wonder if the other cryptosystems I know, RSA and diffie-hellman, are also somehow commutative. DH is symmetric though, so maybe not that. what about RSA?
spaces and geometries
- erlangen program says a space is a lie group? "Felix Klein proposed in his 1872 Erlangen Program that the classical geometries be considered as the properties of a space invariant under a transitive Lie group action."
- thurston's paper says this:
One way to think of a geometric structure on a manifold M is that it is given by a complete, locally homogeneous Riemannian metric. It is better, however, to define a geometric structure to be a space modelled on a homogeneous space (X, G), where X is a manifold and G is a group of diffeomorphisms of X such that the stabilizer of any point x in G is a compact subgroup of G. For example, X might be Euclidean space and G the group of Euclidean isometries. M is equipped with a family of "coordinate maps" into X which differ only up to elements of G. We make the assumption that M is complete. If X is simply-connected, this condition says that M must be of the form X/Gamma where Gamma is a discrete subgroup of G without fixed points.
- here modelled on means that the transition maps are in G? no, the coordinate maps, that's what he says
- each simply-connected open set (?) looks like X and has a map into M (the chart?) such that on the overlaps, it looks like (a restriction of) an element of G? or maybe the map is M -> X actually, that would make more sense, and that's what he says, okay, great.
- how is this equivalent to locally homogeneous riemannian metric? this is probably the "relativization" of whatever theorem in Lie says that all manifolds have a riemannian metric?
- here the manifold M is the topology, and the homogeneous space (G, X=G/H) is the geometry. then each equivalence class of (G,X) structures on M is an element of the mapping class group Mod(M); finding the space Mod(M) is the classification question
- https://www.math.umd.edu/~wmg/icm.pdf
i remember i was confused a couple years ago because it felt like there were two or three different notions of "geometry". this was one of them, the thurston-style one, and then there was the "spherical/euclidean/hyperbolic", which i guess is this in 2D? but that was based on spaces of uniform curvature, which is not obviously this?
there was another one too, maybe just the notion of riemannian geometries. I guess under the erlangen program those would all be euclidean because that's the (X,G)? thurston's quote contradicts that, and actually that makes sense because G isn't all diffeos in the euclidean case, it's only rigid transforms. so does the full group of diffeos give a geometry? seems like the stabilizer of a point would probably not be compact; not sure how we're topologizing the group of diffeos though. i guess if we think of a diffeo as a jacobian matrix at each point, then you could put a metric on this as the sup over all points of some matrix difference?
convergence in measure of CDFs
notice for random variables we have two notions of convergence which treat them like measurable functions: convergence ae and convergence in probability ($\mathbb{P}(|X_{n}-X|>\varepsilon)\to 0$ for all $\varepsilon >0$); and one notion of convergence which doesn't care about their base space: convergence in distribution, which is pointwise convergence of the CDF at all continuity points.
- (ANS: NO) since the CDF is a measurable function, we can ask if convergence ae and convergence in measure (locally or globally) are meaningful for CDFs.
- best approach here is probably to start by constructing examples that converge/don't converge in this sense while converging/not converging in the standard senses
- (NOT POSSIBLE) we can ask if convergence in probability can be expressed in terms of CDFs, so that it becomes independent of the base space.
- one approach might be to turn the r.v. into a r.v. with base space $[0,1]$ using the inverse of the CDF, and ask if convergence in probability of these new equivalent r.v.s is the same as convergence in probability of the original r.v.s
- (MOOT) if we are successful on the above points and convergence properties of the original r.v.s translate to convergence properties of the CDFs, then it might be plausible to translate convergence properties "backwards" as well, in some way, so that convergence ae for CDF => some weird kind of convergence for r.v.s is analogous to the implication convergence ae for r.v.s => the same weird kind of convergence for a new and hopefully interesting type of object
- this is sketchier and should be attempted last
- see below
there's also a few notions of convergence of measures https://en.wikipedia.org/wiki/Convergence_of_measures, which notes that convergence in distribution is weak convergence of the pushforward measure.
it would be good to devise as many notions of convergence as you can think of and figure out how they relate and why certain ones are more useful than others.
///////////
okay I figured out (2), I guess. convergence in distribution is basically an ae convergence of the MONOTONIZATIONS of the random variables. by this I mean consider the rv with the same law as $X_n$ given by $F_{n}^{-1}:[0,1] \to \mathbb{R}$; this is basically (up to some unimportant things like the shape of the domain and continuity points) the unique monotonic function with this law, and $F_{n}$ converging at all continuity points means the same thing as the monotonization converging at all continuity points.
but it should be clear that this completely destroys ALL the information present in the events of $X$ in the sense that we are treating every event $\omega$ in the probability space as exactly the same and interchangeable, which is generally a big loss of information. in other words, this sorting process which allows us to monotonize DEPENDS ON $n$ in a fundamentally unpredictable way; we can decompose each random variable into two maps, one being the monotonization map $M_{n}:\Omega \to [0,1]$ and the second being the monotonized random variable $F_{n}^{-1}:[0,1]\to \mathbb{R}$. Convergence in distribution completely determines the behavior of the second map, and says exactly nothing about the first map; we are free to choose it at will.
However, convergence ae and convergence in probability make strong claims about this first map; they imply that it converges, since they are convergence conditions on $F^{-1} \circ M$. I think the interpretation that each individual event will settle down to a nice place in $[0,1]$, almost surely or in probability respectively, should still hold.
As an example, consider the CLT for coin flips with values -1 and 1: each individual random walk never settles down, with its limsup and liminf diverging, and $M_{n}$ will almost surely map it to every point of $[0,1]$ infinitely often, yet the second map settles down. Conversely, with the LIL we see that even though every event will have limsup 1 and liminf -1, $M_{n}$ should still map every single event to every point of $[0,1]$ almost surely. However, since the distribution concentrates all mass on a single point (zero, since the distribution is roughly normal divided by $\sqrt{2\log \log n}$), $M_{n}$ has to map all the probability mass of $\Omega$ to zero, so it converges in probability as well. This is unusual; if the distribution had concentrated all the mass at more than one point, $M_{n}$ would have two choices in $[0,1]$ on where to map all the probability mass, and could switch it around all the time and not have to converge in probability. it's like some sort of forwards vs backwards probability mass: the preimage of each of the points might have mass 0.5, but going forwards there's no way to pick some set of events of half the mass which go to that point in any reasonable way.
this breakdown $F^{-1}\circ M$ also sheds some light on the other questions. If we want $F$ to converge in other weird senses, these give conditions on $F^{-1}$ but not on $M$, and so we can say nothing about convergence of the random variables in probability, in $L^{p}$, or almost everywhere, as these require $M$ to not be bad. so (3) is moot and for (1) the answer is that weaker notions of convergence of $F$ are probably uninteresting. Actually, since each $F_{n}$ is monotonic these might all end up being the same? try to check this.
- MSE says conv in measure -> conv pointwise at cont pts, proof looks intuitive: https://math.stackexchange.com/questions/869038/if-a-sequence-of-monotone-functions-converges-in-measure-does-it-also-converge
As a new question, we can ask 4) what do convergence conditions on just $M$ look like? can we make a table of convergence conditions on $M$ vs convergence conditions on $F$ and see which various convergence conditions on $M \circ F$ we might get? are any of the normal convergence conditions not possible to split into separate conditions on $M$ and $F$?
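a quick numerical illustration of the split: take $X_{n}=(-1)^{n}X$ with $X$ standard normal (my choice of example), so the law (hence $F_{n}^{-1}$) is literally constant in $n$, while the event-wise map flips forever:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal(100_000)  # one fixed sample of events omega
for n in range(4):
    Xn = X if n % 2 == 0 else -X  # X_n = (-1)^n X, law is N(0,1) every time
    # the quantile function F_n^{-1} (the "monotonization") never changes:
    print(np.round(np.quantile(Xn, [0.25, 0.5, 0.75]), 2))
# but the event-wise maps never settle: |X_{n+1} - X_n| = 2|X_n|
print(np.mean(np.abs(2 * X) > 1))  # ~0.62, so no convergence in probability
```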
incomparable turing degrees
it seems like a non-forcing approach to incomparable turing degrees would be some sort of counting thing:
- assume for contradiction all Turing degrees are comparable
- then they form a total order
- each turing degree is above only countably many, by the definition of turing reduction
- (that is, given a representative of the turing degree A, the turing reductions using some machine with a given index e and oracle A, give all the turing degrees less than A. so there's only countably many since there's countably many machines.)
- each turing degree contains only countably many subsets of $\mathbb{N}$, by the definition of turing reduction
- note that if A and B are both in a turing degree a, and C and D are in a turing degree b < a, then C < A, D < A, C < B, and D < B should all hold, ie there are turing machines witnessing all four. so it's probably not like the error is in that turing degrees can have uncountably many representatives
- so we have an at most $\aleph_{1}$-sized union of countable sets containing every subset of $\mathbb{N}$ (in a total order where every element has only countably many predecessors, an increasing $\omega_{1}$-chain is cofinal, so there are at most $\aleph_{1}$ elements). since there's $2^{\aleph_{0}}$ subsets of $\mathbb{N}$, and its consistent that $2^{\aleph_{0}}>\aleph_{1}$, we have the desired contradiction.
- but this is far too easy to be correct, so what's wrong?
categoricity of Th(N)
so PA has lots of countable nonstandard models, but maybe we can get rid of that by adding some axioms? (lowenheim-skolem says we have to keep some nonstandard models so let's only try to kill countable ones.) We can start by adding all the axioms, looking at the complete theory of true arithmetic Th(N). One would really hope this is $\aleph_{0}$-categorical, ie there's nothing countable elementarily equivalent to but not isomorphic to $\mathbb{N}$. but looking at wikipedia - this is false!!
there are $2^\kappa$ many nonisomorphic models for each cardinal $\kappa$! how cooked is that?? very sad. this is apparently what it means to be an "unstable theory".
wang tiles
question: what is the arithmetic complexity of the set of valid sets of wang tiles?
- we can code a set of wang tiles as a finite set of natural numbers as follows: (# of tiles, left side of first tile, right side of first tile, top side of first tile, bottom side of first tile, left of second tile, etc), and so we can computably encode a set of wang tiles into a single natural number
- a set of wang tiles is valid if it tiles the plane: if there is a function $\mathbb{Z}^{2}\to T$ where each edge matches the tiles on either side
- let's agree to make the distinction between a tileset and a tiling - there are many tilings corresponding to a given tileset
- there is no program to decide if a given tileset is valid (it is an undecidable problem), so this set can't be computable
- could it be c.e.? then there would be a program which outputs all valid tilesets eventually
- what about an upper bound on the complexity?
- $T$ is a valid tileset iff for all (n,m) in $\mathbb{Z}^{2}$ there exists $k$ such that ??
- can't directly do $\forall \exists$ bc need the witnesses to be compatible, and this is a global condition
- we want to say "there exists a function" - is this possible? seems like a second-order thing, since quantifying over functions probably lets us quantify over predicates, so we have to trick our way around this
- look at the complement instead
- could the set of invalid tilesets be c.e.? if tiling arbitrarily large squares implies an infinite tiling, then for sure invalid tilesets can be ruled out in a c.e. manner (see the sketch after this list)
- and actually, counterintuitively, this is true, by a compactness argument, that we saw in class. it's probably worth writing it out again: the theory is the set of statements "the given tiling is a valid tiling of the $n \times n$ square". if a tiling tiles arbitrarily large squares, it is a model of all finite subsets of the theory, and then by compactness there is a model of the whole theory, a valid wang tiling.
- okay, so it's not the set of tiles which is satisfying the formulas here, but rather the tiling itself, the function.
- so if a set of tiles, not a tiling, is valid, it means that there exists a valid tiling for it
- if a tiling fails it fails in finite time, as shown by compactness, and this makes sense. but I'm not sure the same result is true for tilesets, since there's lots of tilings for a given tileset
- like some of the complexity probably comes from the fact that there's uncountably many tilings ($(<\mathbb{N})^\mathbb{N}$) and only countably many tilesets ($\mathbb{N}^{<\mathbb{N}}$)?
- so it seems like the fact that a tiling can be ruled out computably doesn't mean a tileset can be ruled out, otherwise it would be co-c.e.
- another look at the compactness proof
- the homework problem was that there is an infinite tiling if there are tilings of every finite size
- so actually, the homework problem is about tilesets, not tilings. much stronger! maybe I did it wrong?
- the theory I used was not "the tiling is valid up to size $n$", it was instead "the $n$th edge does not break the tiling", framed as an implication "tile $t$ at (n,m) -> one of these other tiles is at (n+1, m)" and just lots of these
- but how can a tileset model this theory? isn't this the same as saying "there is a tiling of a bunch of edges"? tilings model the theory, not tilesets. you messed up.
- we want a formula which is modeled by a wang tileset iff the tileset has a tiling of size $n \times n$. is this possible? yes, just say there exist $n \times n$ tiles such that the edge implications above are all true.
- maybe this works?
- anyway, given the homework problem, it's $\Pi_{1}$, and it can't also be $\Sigma_{1}$, since then it would be computable. is it $\Pi_{1}$-complete? yes, that's what the halting problem reduction we did in class was. if we can figure out tilesets we can figure out which machines don't halt.
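here's a sketch of the $\Pi_{1}$ refutation procedure, with a toy encoding I made up (a tile is a (left, right, top, bottom) tuple of colors, matching the coding above): by compactness, an invalid tileset must fail to tile some $n \times n$ square, and a program can find that $n$ by brute force.

```python
def tiles_square(tileset, n):
    # backtracking search: can tileset tile an n x n square with matching edges?
    grid = [[None] * n for _ in range(n)]
    def place(i, j):
        if i == n:
            return True
        ni, nj = (i, j + 1) if j + 1 < n else (i + 1, 0)
        for t in tileset:
            if j > 0 and grid[i][j - 1][1] != t[0]:  # left neighbor's right edge
                continue
            if i > 0 and grid[i - 1][j][3] != t[2]:  # top neighbor's bottom edge
                continue
            grid[i][j] = t
            if place(ni, nj):
                return True
        grid[i][j] = None
        return False
    return place(0, 0)

def refute(tileset, max_n=5):
    # co-c.e. side: search for a finite square the tileset cannot tile
    for n in range(1, max_n + 1):
        if not tiles_square(tileset, n):
            return n  # witness of invalidity
    return None  # no refutation up to max_n (says nothing about validity)

print(refute([(0, 1, 0, 0)]))  # right edge never matches a left edge: fails at 2
print(refute([(0, 0, 0, 0)]))  # tiles everything: None
```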
two funky things in computability theory
- when we talk about a set meeting dense sets, the first usage of "set" refers to a path through the tree (ie being in the set is turning left at that height) and the second usage of "set" is a set of finite strings (ie there is some bijection between naturals and finite strings)
- is there something fruitful to be had of this contrast? can this help me understand forcing?
- the typical set of reals (normal numbers, for example) is of full measure and also meagre, so these notions of smallness are orthogonal in some sense
- there's an MO post about this but I don't fully understand
- I think it said something about generic sets having this property. do 1-generic sets have this property? no, other way around: it says the 1-generics are comeagre and measure zero, while the 1-randoms are meagre and full measure
pareto principle
the pareto principle says "80% of the gains come from 20% of the effort"
my understanding is that the pareto distribution has PDF proportional to $t^{-k}$ when $t \ge t_{0}$ and $0$ for $t < t_{0}$, with parameters $k>1$ and $t_{0} > 0$.
here 20% of the effort (time) would be infinite, so I would say the claim to hope for here is that on any interval not containing $t_{0}$, 80% of the mass of the PDF in the interval is contained in the first 20% of that interval. is that true for all $k$? any $k$? let's see.
$\int_{t_{0}}^{t_{1}} t^{-k}\ \mathrm{d}t = \frac{t_{1}^{-k+1} -t_{0}^{-k+1}}{-k+1}$, and the constant cancels from both sides, so we want $0.8(t_{1}^{-k+1}-t_{0}^{-k+1})=(0.2t_{1}+0.8t_{0})^{-k+1}-t_{0}^{-k+1}$
I'll call $-k+1,t_{0},t_{1}$ as $u,a,b$ for convenience.
$0.8b^{u}-0.8a^{u}+a^{u}=(0.2b+0.8a)^{u}$
$(0.2b+0.8a)^{u}=0.8b^{u}+0.2a^{u}$
via desmos, for $u=-1$ we get a line at $b=16a$ in addition to $b=a$. so not quite everything. can we maybe set up a differential equation for this? don't need a diffeq actually, it's a functional equation
$0.8F(b)-0.8F(a)=F(0.2b+0.8a)-F(a)$
what sorts of functions satisfy this?
wikipedia has the $u$ that works at https://en.wikipedia.org/wiki/Pareto_distribution#Relation_to_the_%22Pareto_principle%22
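a quick numerical check of the $u=-1$ claim, pure python bisection instead of desmos:

```python
def f(x, u):
    # (0.2x + 0.8)^u - 0.8 x^u - 0.2, whose roots are the ratios x = b/a
    return (0.2 * x + 0.8) ** u - 0.8 * x ** u - 0.2

def bisect(g, lo, hi, iters=60):
    for _ in range(iters):
        mid = (lo + hi) / 2
        lo, hi = (mid, hi) if g(lo) * g(mid) > 0 else (lo, mid)
    return (lo + hi) / 2

# besides the trivial root x = 1, for u = -1 there should be a root at x = 16
print(bisect(lambda x: f(x, -1), 2.0, 100.0))  # -> 16.000...
```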
a youtube comment
For a more intuitive sense of where the E8 lattice comes from, there is an exceptional series of polytopes discovered by Thorold Gosset over 100 years ago, known as the k_21 semiregular figures (but I like to call them kaleidoplexes) that leads up to it. These objects can be understood as being built from simplexes and orthoplexes, arranged around relevant facets in 1:2 ratio.
- The first kaleidoplex exists in 3D as the triangular prism, built from 2 triangles and 3 squares, with 1 triangle and 2 squares around each vertex. In the k_21 naming system, it is the k = –1 figure.
- The 4-kaleidoplex (k = 0) is the rectified pentachoron (1 of the 3 semiregular polychora), built from 5 tetrahedra and 5 octahedra, with 1 tetrahedron and 2 octahedra around each edge.
- The 5-kaleidoplex (k = 1) is the demipenteract (an alternated 5-hypercube), built from 16 pentachora (the 4-simplex) and 10 4-orthoplexes in the same 1:2 ratio around each triangular face.
- The 6-kaleidoplex (k = 2) is the first uniquely E-type figure, and is built from 72 5-simplexes and 27 5-orthoplexes in the same 1:2 ratio around each tetrahedral cell.
- The 7-kaleidoplex (k = 3) is built from 576 6-simplexes and 126 6-orthoplexes in the same 1:2 ratio around each 4-simplex facet.
- The 8-kaleidoplex (k = 4) is built from 17280 7-simplexes and 2160 7-orthoplexes in the same 1:2 ratio around each 5-simplex facet. Also, it has 240 vertexes, which "happens to be" the kissing number of the 7-sphere in 8D as well. (The 7-sphere is also the first sphere that permits exotic versions, interestingly enough.)
Meanwhile, the circumradius relative to edge length has been steadily increasing from √21/6 (~0.764) for the prism, through √15/5, √10/4, √6/3, √3/2, (notice the triangular numbers) and finally to 1 for the 8-kaleidoplex. Additionally, the 2 types of internal angles between sides have also been increasing.
This is where the E8 lattice finally appears. Rather than there being a proper 9-kaleidoplex, the 5_21 is actually a tesselation of 8D space whose vertex figure is the 4_21. It is still built from 8-simplexes and 8-orthoplexes in a 1:2 ratio around each 6-simplex facet, but the 2 angle types are now 180 degrees, hence why it can't "fold up" into a convex object in 9D space.
While most people are aware of the Platonic solids, these E-type objects––along with the F-type 24-cell in 4D, the G-type hexagon in 2D, the non-crystallographic H-type pentagonal polytopes in just 2, 3, and 4 dimensions, and the 4 infinite families of A-type simplexes, B-type measures (the hypercubes), C-type orthoplexes (technically also B, and not properly C until considering root systems), and D-type demi-hypercubes (which I prefer to call alterplexes)––fill out a larger classification structure for the possible types of symmetries, one which is directly related to the classification of finite simple groups, too.
huh? try to understand at some point
also see https://en.wikipedia.org/wiki/Exceptional_object#/media/File:Exceptionalmindmap2.png
these are comments from aleph 0's video on the E8 sphere packing
wang tiles in computable hierarchy
first code all the sets of wang tiles as natural numbers. that is there is some computable function $\text{tiles}:\mathbb{N}\to \text{Wang tiles}$ for which $\text{tiles}(n)$ is a unique set of Wang tiles up to isomorphism.
then, there is some subset $A \subseteq \mathbb{N}$ corresponding to wang tiles which can tile the plane.
what is the formula for $A$? what is its complexity?
$n\in A$ means:
- there exists (a tiling using the tiles given by n)
- there exists a function $\mathbb{Z}^{2}\to \text{tiles}(n)$ such that for all pairs of adjacent coordinates, the wang tiles are compatible
so if expressing a function doesn't add complexity then it's $\Sigma_{2}$. but it might add complexity? can you computably code a function $\mathbb{Z}^{2} \to \text{tiles}(n)$ as a natural number? this seems unlikely, since $|\mathbb{N}^\mathbb{N}| > |\mathbb{N}|$. so ??
instead code it as an infinite sequence of compatible finite tilings? "there is a sequence $(T_{n})$" such that for all $n$, $T_{n}$ is a finite tiling of the $n \times n$ square whose restriction is $T_{n-1}$
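one way to resolve this, using the compactness fact from the other wang tiles section (tiling the plane iff tiling every finite square): the function quantifier becomes a universal quantifier over finite squares, and the inner $\exists$ ranges over the finitely many candidate $k \times k$ tilings, so it is bounded:
$$
n \in A \iff \forall k\ \exists T \leq B(n,k)\ (T \text{ codes a locally compatible } k \times k \text{ tiling by } \mathrm{tiles}(n))
$$
where $B(n,k)$ is a computable bound on the codes of $k \times k$ tilings. a bounded $\exists$ adds no quantifier complexity, so $A$ is $\Pi_{1}$, matching the other section.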
measure zero vs meagre
https://mathoverflow.net/questions/43478/is-there-a-measure-zero-set-which-isnt-meagre
- they're similar: the Erdos-Sierpinski duality theorem says that assuming CH, there is an involution of $\mathbb{R}$ (bijection of order 2) which sends meagre sets to null sets and null sets to meagre sets.
- but they're also orthogonal: there is a decomposition of $\mathbb{R}$ into a meagre set and a null set
- ultimately the differences win out, bc Shelah proved that to make all sets of reals measurable you need large cardinals, but you don't need large cardinals to make every set of reals have the Baire property (the category analogue of measurability)
- the construction of balls around rationals gives a measure zero comeagre set
- why is the intersection not just Q?
- the set of 1-generics is comeagre and measure 0, the set of 1-randoms is meagre and full measure
- read Oxtoby, Measure and Category
- the set of reals whose binary expansion is not "half zeroes and half ones" - this is measure zero and comeagre. in fact, the set of reals whose binary expansion has many zeroes is comeagre
- many "average" sets of reals have full measure and are meagre: normal numbers, Khinchin's constant stuff, numbers with irrationality measure 2, diophantine numbers
Justin's rep theory talk
- Tannakian philosophy: study $G$ via category $\mathrm{Rep}(G)$
- Tannakian reconstruction theorem: one can recover $G$ from $(\mathrm{Rep}(G), \otimes , \text{forgetful functor} \to \mathrm{Vec}_\mathbb{C} )$
- some finite dimensional rep theory facts
1)
Aren's rotation number talk
- $\mathrm{rot}(\phi,x)$ measures average winding speed of $X \to \mathbb{T}$ (here $X$ will be taken to be a compact metric space throughout)
- ergodic theorem for flow $T$: if $\mu$ is $T$-ergodic, then for all $g \in L^{1}$ and $\mu$-a.e. $x$, $\lim_{t \to \infty} \frac{1}{t} \int_{0}^{t} g(T^{s}x)\ \mathrm{d}s = \int_{X} g \ \mathrm{d} \mu$
- $C^{H}(X, \mathbb{T})$ is homotopy classes of $C(X,\mathbb{T})$
- abelian group, countable since separable
- $d(x,y)=\mathrm{dist}(x-y,\mathbb{Z})$
- $d(\phi(x),\phi'(x))<\frac{1}{2} \longrightarrow \phi \sim \phi'$ (homotopic)
- proposition: $\phi, \phi':X \to \mathbb{T}$ are homotopic iff there exists lift $\widetilde{\phi}:X \to \mathbb{R}$ st $\phi(x)=\phi'(x)+\pi(\widetilde{\phi}(x))$
- there's then a diagram with $\mathrm{ev}_{0}:\Gamma \to \mathbb{T}$, $\phi:X \to \mathbb{T}$, and $\pi:\mathbb{R} \to \mathbb{T}$, and those three things to $\mathbb{T}$ are made into two triangles with $f:X \to \Gamma$ and $\widetilde{\mathrm{ev}}_{0}: \Gamma \to \mathbb{R}$
- $\phi:X \to \mathbb{T}$ gives for $x \in X$, $\phi_{x}:\mathbb{R}\to \mathbb{T}$ via $t \mapsto \phi(T^{t}x)$ and the lift $\widetilde{\phi}_{x}: \mathbb{R}\to \mathbb{R}$
- definition: $\phi$ is differentiable (resp $C^{1}$) along the flow if $\widetilde{\phi}_{x}$ is differentiable (resp $C^{1}$), and $\partial \phi (x)=\frac{\mathrm{d} \widetilde{\phi}_x}{\mathrm{d}t}(0)$
- definition: the rotation number $\mathrm{rot}(\phi;x)$ of $\phi$ along the orbit of $x$ is $\lim_{t \to \infty } \frac{\widetilde{\phi}_{x}(t)}{t}$
- q: why not just $\lim _{t \to \infty } \frac{\phi(F^{t}x)}{t}$? why the lift? ans: dumb question, did I mean something different at the time?
- claim:
- $C^{1}$ functions along the flow are uniformly dense
- $\mathrm{rot}(\phi;x)=\int_{X}(\partial \phi) \ \mathrm{d}\mu$ for $\mu$-a.e. $x$
- theorem: suppose $(\mu, \phi)$ ergodic. then:
- $\mathrm{rot}(\phi; x)$ exists for $\mu$-a.e. $x$
- for all $\phi$ there exists $A_{\mu}(\phi)\in \mathbb{R}$ such that $\mathrm{rot}(\phi; x)=A_\mu(\phi)$ for $\mu$-a.e. $x$
- if $\phi \sim \phi'$ then $A_\mu (\phi) = A_\mu (\phi')$
- proof: apply ergodic theorem
- "special case": homeomorphisms $\mathbb{T}\to \mathbb{T}$, except discrete
- take double lift $\mathbb{R}\to \mathbb{R}$ (periodic)
- then $\mathrm{rot}(\phi; x)= \lim_{n} \widetilde{\phi} ^{n}(x)/n$ - warning: not independent of lift!!
- proposition: $\phi$ has a periodic point iff $\mathrm{rot}(\phi; x)$ is rational
- forward direction is boring, just do a computation
- reverse direction (wtf? how does average rational give infinitely many rationals?):
- suppose $\mathrm{rot}(\phi; x) = \frac{p}{q}$. then $\mathrm{rot}(\phi^{q}; x) = p$, ie we want to show $\phi^{q}$ has a fixed point
- if no fixed point, choose the lift such that $\widetilde{\phi}(0) \in [0,1)$; having no fixed point means $\widetilde{\phi} (x)-x \notin \mathbb{Z}$ for all $x$
- then $0< \widetilde{\phi}(x)-x <1$ for all $x \in \mathbb{R}$ and in fact (since $\widetilde{\phi}(x)-x$ is continuous and periodic) $0 < \varepsilon \leq \widetilde{\phi} (x) -x \leq 1- \varepsilon < 1$
- then $\widetilde{\phi}^{n}(0)=\sum_{i=0}^{n-1}\left(\widetilde{\phi}^{i+1}(0)-\widetilde{\phi}^{i}(0)\right)$, so $\varepsilon \leq \frac{\widetilde{\phi}^{n}(0)}{n} \leq 1-\varepsilon$, which is a contradiction
- could ankify just this proposition + the intuitive definition of rotation number
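a quick numerical sketch of the special case, using the Arnold circle map family as an example lift (my choice of example, not from the talk):

```python
import math

def rotation_number(lift, x0=0.0, n=100_000):
    # estimate rot = lim (lift^n(x0) - x0) / n for a lift of a circle map
    x = x0
    for _ in range(n):
        x = lift(x)
    return (x - x0) / n

# Arnold family x + a + b sin(2 pi x); a homeomorphism when b < 1/(2 pi)
a, b = 0.3, 0.1
print(rotation_number(lambda x: x + a + b * math.sin(2 * math.pi * x)))
```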
talk 4/3
- in ergodic theory we want to study measure-preserving transformations of Polish space, $\mathrm{Aut}$
- $\mathrm{Aut}$ is a Polish group with the topology being weak convergence and the group law being composition of transformations
- we can without too much loss of generality fix it to be Lebesgue measure on $[0,1]$: as long as the probability measure has no atoms, apparently its $\mathrm{Aut}$ will be isomorphic as a Polish group to this canonical $\mathrm{Aut}$. this is uniqueness of standard Borel space?
- the hope is to classify them up to isomorphism, $T \cong T'$ if $T'=S^{-1}TS$ for some $S \in \mathrm{Aut}$
- unfortunately, this is too strong, since each isomorphism class is meagre in $\mathrm{Aut}$ and the equivalence classes form a very complicated set
- we can still prove some stuff about these isomorphism classes: halmos proved in 1944 that the generic transformation in Aut is weakly mixing, and rokhlin proved in 1948 that the generic transformation is not strongly mixing
- here "generically" means for a comeagre set of isomorphism classes
- we will then take a coarser notion of equivalence: namely, consider the (cyclic) group generated by iterating a transformation $T \in \mathrm{Aut}$, ie $T, T^{2},\ldots$, then take its closure in $\mathrm{Aut}$, denoted $\langle T \rangle _{c}$. Two transformations are equivalent if their corresponding subgroups are isomorphic as Polish groups
- Ben later gave some motivation for this definition; if you think of these as matrices here we get polynomials in them and this is enough to determine the spectrum for some reason
- we now have some evidence that generically there is only one (!!) equivalence class, ie a comeagre set of isomorphism classes all generate the same group
- the "only reasonable candidate" for this group was L0, and there was a lot of work to prove this
- L0 is functions from $[0,1]$ (remember this is our Polish space) into the circle; I believe these are measurable functions mod a.e. equality, with the topology of convergence in measure
- we can view L0 as an infinite dimensional torus: first consider constant functions, we get a circle; then break the domain into two pieces and have functions which are constant on each piece, we get a 2-torus; then break the domain into $n$ pieces and have functions which are constant on each piece, we get an $n$-torus
- each of these finite tori embed into each other (functions constant on both pieces) and so L0 is union of all these tori embedded in each other
- main result of the speaker: for generic $T$ in $\mathrm{Aut}$, $\langle T \rangle _{c}$ is not $L^{0}$, sad
- he used spectral methods analyzing a measure associated to each $T$, and got a contradiction because the measures associated to $T^{n}$ should be mutually singular by one result, and absolutely continuous by another result
- he used a technical lemma which is a category version of Fubini's theorem, the Kuratowski-Ulam theorem. it says that $A \subseteq X \times Y$ comeagre implies there are comeagrely (in $Y$) many slices $A_{y} \subseteq X$ which are comeagre in $X$. the converse is true under a mild hypothesis as well (I forgot what it was though).
- L0 is a Levy group; he in fact showed that no Levy groups can be the generic equivalence class
- since there are no other reasonable possibilities, ie if there is a generic equivalence class it really should probably be a Levy group, maybe there is more than one equivalence class but all the equivalence classes are Levy groups
fractional quantum hall effect
- hall effect
- effect: when you put a wire with a current in a magnetic field perpendicular to the current, the wire experiences an electric potential difference in the third direction perpendicular to both the current and the field
- when you have current in a conductor, this generates a magnetic field circling the wire (right-hand rule)
- however, the hall effect is about an externally applied magnetic field acting on a current
- the lorentz force law is $\mathbf{F}=q(\mathbf{E} + \mathbf{v}\times \mathbf{B})$, which was empirically determined in the late 1800s and is used as the definition of the electric and magnetic fields, since the law appears to be valid even with $\mathbf{v}$ close to the speed of light
- for a continuous charge distribution in motion we get force density $\mathbf{f} = \rho(\mathbf{E}+\mathbf{v}\times \mathbf{B})=\rho \mathbf{E} + \mathbf{J} \times \mathbf{B}$.
- this gives a force pushing on the charges in the wire, which explains the potential difference as follows:
- no B field => charges follow approximately straight paths between collisions with impurities, phonons, etc.
- with perpendicular B field => paths between collisions are curved so moving charges accumulate on one face of the material
- => leaves equal and opposite charges exposed on the other face, where there is a scarcity of mobile charges
- => asymmetric distribution of charge density across the Hall element, giving rise to the transverse potential difference
- quantum hall effect
- this is now for a 2D system in a MOSFET with large B field and low T
- the Hall conductance $\sigma$ undergoes quantum Hall transitions to take quantized values
- there is also spin hall effect and spin quantum hall effect, which require no B field, and in which spin gets polarized on opposite sides
- there is also anomalous hall effect and anomalous quantum hall effect, when the conducting material is ferromagnetic
- fractional quantum hall effect
metrization theorems
- urysohn's metrization theorem: every regular Hausdorff (T3) second-countable space is metrizable
- regular = can separate point and closed set, hausdorff = can separate two points. both imply that points are closed, so regular hausdorff = regular + T0 (all points are topologically distinguishable)
- restatement: separable and metrizable iff it is regular Hausdorff and second-countable
- proof idea: separable metrizable spaces are homeomorphic to subset of Hilbert cube $[0,1]^{\mathbb{N}}$, and so are regular hausdorff second countable spaces
- for a separable metric space, scale it so that all distances are less than 1, then encode every point as the sequence of its distances to a countable dense set, and the metric agrees by construction
- for a regular hausdorff second countable space, we do some trick where we fix a countable basis and consider all pairs (U,V) of basis elements such that the closure of the first is contained in the second. then we have Urysohn functions $f_{UV}$ which are 0 on U and 1 on S\V, and we use those maps to send each point s to the sequence of $f_{UV}(s) \in [0,1]$ indexed over all U,V, giving a map from the space to the hilbert cube. you have to check this is an embedding, and then separability and metrizability are inherited from the hilbert cube.
- Nagata-Smirnov metrization theorem: metrizable iff regular Hausdorff and it has a $\sigma$-locally-finite base
- this is the extension of urysohn to the nonseparable case
- bing's metrization theorem: metrizable iff regular Hausdorff and it has a $\sigma$-discrete basis
- $\sigma$-discrete = countable union of discrete families, where discrete = every pt has a neighborhood touching at most one member of the family
- smirnov metrization theorem: locally metrizable + paracompact iff metrizable
- the only obstruction to gluing local metrics into a global metric is the space being too big, and paracompactness rules that out
- paracompact = every open cover has a locally finite open refinement (like a subcover but we can also shrink the sets; with "finite subcover" instead this would be compactness)
- frink's metrization theorem: metrizable iff every point has a countable neighborhood basis with some weird containment conditions
some notes on electricity and pauli's exclusion principle
prompted by this blogpost from HN: https://lcamtuf.substack.com/p/but-good-sir-what-is-electricity
how electrons flow in a wire is certainly something to investigate - our electrical and plumbing systems are to me the two pillars of modern amenities. i've phrased that badly, but in any case, electrons in metals. let's do it.
- we recall electrons live in quantized orbitals, which are solutions to the schrodinger equation that square to a probability distribution
- in the atomic case these solutions appear to be described by four quantum numbers: the principal quantum number $n$ which corresponds to the size of the orbital, the angular momentum quantum number $l < n$ which corresponds to the shape of the orbital, the magnetic quantum number $-l \le m_l \le l$ which corresponds roughly to the direction of the orbital, and the spin magnetic quantum number $m_s = \pm \frac{1}{2}$ which distinguishes the two electrons in each orbital
- then the pauli exclusion principle says that the wavefunction is antisymmetric upon switching fermions, so that if two electrons had the same quantum numbers, we could switch them and get that $\Psi=-\Psi$ and that's bad, so we get a unique set of quantum numbers for each electron
- main rabbit hole: why do fermions behave this way? why does all known matter obey either fermi-dirac statistics or bose-einstein statistics?
- atoms bond by "sharing" or "stealing" electrons - what this means is that there's a complicated solution to the wavefunction involving orbitals that go across atoms, and it's still discrete and finitely many but harder to describe. this is the realm of molecular orbital theory
- this mostly only involves valence electrons, since the inner shells almost always have energetically favorable single-atom orbitals
- in metals, however, the solutions blend out so much that the easiest way to model electrons is as a gas!
- applying a charge on one side is like increasing the pressure there; the pressure wave travels quickly to the other side (like the speed of sound for electrons instead of air molecules) despite the electrons not moving so fast
- we have four numbers: the speed of the field, the speed of the wave, the drift velocity of the electrons, and the thermal velocity of the electrons
- the electric field propagates at the speed of the wavefront, given by the velocity factor, typically 50-99% of the speed of light
- in a vacuum the factor is 1, in air it's about 0.9997.
- the electromagnetic waves inside the conductor itself propagate much slower. the velocity in a low-loss dielectric is given by $v = \frac{1}{\sqrt{\varepsilon \mu}}=\frac{c}{\sqrt{\varepsilon_{r}\mu_{r}}}$. $\varepsilon_{r}= \varepsilon/ \varepsilon_{0}$ is the permittivity relative to the vacuum permittivity $\varepsilon_{0}$ and $\mu_{r}$ is the magnetic permeability relative to the vacuum $\mu_{0}$.
- permittivity describes how polarizable a material is; if you apply a field $E$ you get a electric displacement field = electric flux density of $D = \varepsilon E$
- permeability is analogous for the magnetic field, the ratio of induction to the applied field: $B = \mu H$
- in copper at 60 Hz, this is about 3.2 m/s
- i'm confused by https://en.wikipedia.org/wiki/Speed_of_electricity
- the drift velocity is simpler, it's the average velocity of the electrons, and it's very slow. for 2mm diameter copper at 1 amp it's around 20 micrometers per second. so at 60 Hz for example, it only drifts around 0.2 micrometers before changing direction
- drift velocity is proportional to current, which in resistive materials is proportional to the electric field: $u=\mu E$, where $\mu$ is called the electron mobility. this is the basis for Ohm's law
- the thermal velocity = fermi velocity = average speed of an individual electron, is around 1570 km/s at room temperature
- how was this calculated??
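my attempt at reproducing the drift and thermal numbers above; I believe the 1570 km/s figure is the fermi velocity of the free-electron gas, $v_{F}=\hbar(3\pi^{2}n)^{1/3}/m_{e}$, with copper's conduction-electron density as the assumed input:

```python
import math

n = 8.5e28         # assumed: conduction electrons per m^3 in copper
q = 1.602e-19      # elementary charge, C
hbar = 1.0546e-34  # reduced Planck constant, J s
m_e = 9.109e-31    # electron mass, kg

# drift velocity: I = n A q v  =>  v = I / (n A q)
I, d = 1.0, 2e-3   # 1 amp through a 2 mm diameter wire
A = math.pi * (d / 2) ** 2
print(I / (n * A * q))  # ~2.3e-5 m/s: tens of micrometers per second

# fermi velocity: v_F = hbar * (3 pi^2 n)^(1/3) / m_e
k_F = (3 * math.pi ** 2 * n) ** (1 / 3)
print(hbar * k_F / m_e)  # ~1.57e6 m/s, the ~1570 km/s figure
```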
on to the main rabbit hole
- quantum mechanics postulates that particles of the same type are indistinguishable
- this means that exchanging any two of them should not affect the quantum state, which is a vector in a big hilbert space (the wavefunction)
- however, since we are in complex projective space, we can have a phase change
- we also need the fact that physical laws do not change under lorentz transformations, by special relativity
- microcausality = idea that spacelike-separated fields either commute or anticommute; needs a relativistic theory with a time direction for 'spacelike' to make sense
- we define bosons and fermions to be particles whose wavefunction is symmetric / antisymmetric under exchange
- theorem: integral-spin particles are bosons with Bose–Einstein statistics, and half-integral-spin particles are fermions with Fermi–Dirac statistics.
- no elementary proof known despite the statement being elementary
- why can't spin take on other values?
- related to representations of SO(3) and SU(2) - fermions, half-integer spins, are spinor representations of SU(2), while bosons, integer spins, are honest representations of SO(3)
- SU(2) is double cover of SO(3), corresponding to possible sign change. the particles should be invariant under rotation so that's why we get these groups?
- the quantum states transform under the action of these groups, so group action on vector space -> representations
- boson representations: spherical harmonics form a complete basis for the representations of SO(3). spherical harmonics are functions on the sphere describing angular momentum of wavefunction, have an integer parameter. we only know spin 1 elementary bosons, except higgs boson with spin 0, and graviton is predicted to have spin 2 if it exists.
- fermion representations are combinations of the spin 1/2 "spinor representation" flipping sign upon a 360 degree rotation, plus higher dimensional representations invariant under 360 degree rotation. spin s corresponds to dim 2s+1 representation. only spin 1/2, 2D representations of SU(2) are known to correspond to elementary physical particles
- physical particles are understood as excitations of physical fields
- quarks and leptons are fermions with spin 1/2, which is why matter takes up space - degeneracy pressure of electrons
- fermi-dirac and bose-einstein statistics
- we have a distribution describing the average number of particles in a given state, in both cases
- the formulas look very similar and both distributions approach the Maxwell-Boltzmann distribution in the limit of high temperature and low particle density
- wikipedia has derivations of the formulas based on canonical ensemble and microcanonical ensemble approaches
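the two occupancy formulas (average number of particles in a state of energy $E$, with chemical potential $\mu$), plus a quick check that both collapse to maxwell-boltzmann when $(E-\mu) \gg kT$:

```python
import math

def fermi_dirac(E, mu, kT):
    return 1 / (math.exp((E - mu) / kT) + 1)

def bose_einstein(E, mu, kT):
    return 1 / (math.exp((E - mu) / kT) - 1)

def maxwell_boltzmann(E, mu, kT):
    return math.exp(-(E - mu) / kT)

E, mu = 1.0, 0.0
for kT in [0.5, 0.1, 0.05]:  # cooling down: the +1 / -1 stops mattering
    print(fermi_dirac(E, mu, kT), bose_einstein(E, mu, kT),
          maxwell_boltzmann(E, mu, kT))
```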
why are standard borel spaces determined by their cardinality?
- inspired by https://golem.ph.utexas.edu/category/2025/02/universal_characterization_of.html
- definition: a standard borel space is a measurable space $(X,\Sigma)$ that looks like the Borel sigma-algebra on top of some Polish space
- a polish space is just a separable completely metrizable space (metrizable with a complete separable metric)
- so a standard borel space is a space such that, for some choice of metric, it becomes a complete separable metric space whose Borel $\Sigma$-algebra is $\Sigma$
- why do people care about Polish and Borel spaces?
- separability is a size restriction
- separable metric spaces are limited in their cardinality; they must be second countable since balls of radius 1/n around the separating set form a basis, and similarly any second countable space is separable since we can take a point from each basis element (don't need metric in that direction)
- (Kechris 15.A)
- the continuous image of a Borel space might not be Borel (eg map it to the same space with the trivial sigma-algebra? i think this is not a valid counterexample since we probably assume the borel sigma alg on the codomain, maybe)
- thm (lusin-souslin): if u take a subset $A$ of a Polish space $X$ and a continuous $f:X \to Y$, and $f|A$ is injective, then $f(A)$ is Borel
- every polish space is contained in baire space so let's take that for X
- use a Lusin scheme??
- corollary: if X was Borel and $f$ was a borel map, then $A \cong f(A)$, since we can apply the theorem to the projection of $X \times Y \to X$ and $(A \times Y)\cap \mathrm{graph}(f)$
- (Kechris 15.B) (the isomorphism theorem) Let X,Y be standard Borel spaces. Then X,Y are Borel isomorphic iff card(X) = card(Y). In particular: any two uncountable standard Borel spaces are Borel isomorphic.
- a Borel isomorphism is a bijection which is Borel measurable and has Borel measurable inverse
- it's pretty easy to see that finite and countable Borel spaces are determined by cardinality, with the sigma alg being all subsets: in a metric space points are closed, so every subset of a countable space is a countable union of singletons, hence Borel
- why can't Z + pt at infinity work? what rules out limit points generally in the countable case? it is complete, it is metrizable as a subset of the circle, and it is separable since it is countable. so it's polish. but its borel sigma-algebra is still all subsets, so as a borel space it's isomorphic to discrete $\mathbb{Z}$ anyway
- also encyclopedia of math says that standard borel spaces are compact; how is Z or R compact?
don't forget to read the CT thing in the Baez link
- abstract: the category SBor of standard Borel spaces is the (bi-)initial object in the 2-category of countably complete Boolean (countably) extensive categories. This means that SBor is the universal category admitting some familiar algebraic operations of countable arity (e.g., countable products and unions) obeying some simple compatibility conditions (e.g., products distribute over disjoint unions). More generally, for any infinite regular cardinal $\kappa$, the dual of the category $\kappa\mathrm{Bool}_{\kappa}$ of $\kappa$-presented $\kappa$-complete Boolean algebras is (bi-)initial in the 2-category of $\kappa$-complete Boolean ($\kappa$-)extensive categories.
- baez also says: Tom Leinster’s program of revealing the mathematical inevitability of certain traditionally popular concepts. I forget exactly how he put it, but it’s a great program and I remember some examples: he found nice category-theoretic characterizations of Lebesgue integration, entropy, and the nerve of a category.
scissors congruence sum talk
- dehn-hilbert scam
- resize rectangle: figure 5 in either of these papers - the magic ingredient of the basic proof
- stable SC, Zylev
- Caligan: $0\to \mathcal{P}(\mathbb{E}^{2})\overset{P}{\hookrightarrow }\mathcal{P}(\mathbb{E}^{3}) \overset{\mathcal{D}}{\to}\mathbb{R}\otimes_{\mathbb{Z}} \frac{\mathbb{R}}{ \mathbb{Q} \pi} \overset{\mathcal{J}}{\to } \Omega_{\mathbb{R}/\mathbb{Z}} \to 0$
- $\mathcal{P}(\mathbb{E}^2)$ is just $\mathbb{R}$
- $P$ is prisms
- dehn map $\mathcal{D}(p):=\sum_{e}\ell(e)\otimes \alpha(e)$, summing over edges $e$
- $\Omega_{\mathbb{R}/\mathbb{Z}}$ is the Kähler differentials of $\mathbb{R}$ over $\mathbb{Z}$
lizzie's research
- problem: describe 2-4 gluon scattering
- can do this via a 100-page long feynman diagram computation, which magically at the end cancels down to a single rational function
- tells us that feynman diagrams are the wrong framework with which to approach this problem
- alternative framework: semialgebraic (described by inequalities instead of just equations) subset of the Grassmannian called the amplituhedron
- scattering amplitudes much easier to compute using this
symbolic dynamics pitch from albert at pcmi
- topology, metric space: SFT
- space of bi-infinite paths in some graphs
- incidence matrix
- information theory invariant - entropy
- introduction to symbolic dynamics and coding by lind and marcus
- quantum watrous
pcmi talk from scott aaronson
- bernstein-vazirani
- hadamard all the things = fourier transform over $\mathbb{Z} _{2}^{n}$, dual is $\mathbb{T}^{n}$?
- black holes as random quantum programs
- boixo-martinis 2019, 0.1%, 10,000 yrs -> 4 months
- MIP*=RE (2020)
- quantum algos for classical problems like random walks, singular values, exponential speedup for linear algebra problems; also for quantum problems
- (bell, horne, ...)
- clauser, zeilinger, verification of quantum devices (magic square game)
- given description of random test, undecidable whether there exists quantum-classical gap
- connes embedding conjecture on C* algs (1976) falsified
- non-sofic non hyperlinear groups exist?
link from hopf algebras to knot theory
- brief notes on an exposition by palani
- start with a Hopf algebra, for instance a quantum group
- a Hopf algebra is a (unital associative) algebra and compatibly a (counital coassociative) coalgebra with an antipode map
- you should think of comultiplication as like a generalized version of the diagonal map? on the algebras themselves it's like a tensor product
- the antipode map from the Hopf algebra to itself is an antihomomorphism (reverses the order of multiplication) such that the hexagon diagram commutes
- the antipode map, if it exists, is unique, so being a Hopf algebra is a property of a bialgebra, namely admitting an antipode, not really additional structure
- example: group algebra with comultiplication is diagonal map / tensor product, counit is $g \mapsto 1$, antipode is inverse map
- more examples: functions $K^G$ from finite group to field, representative functions on compact group, regular function on algebraic group, tensor algebra $T(V)$, universal enveloping algebra $U(\mathfrak{g})$, sweedler's hopf algebra $H=K[c,x]/(c^{2}=1,x^{2}=0,xc=-cx)$.
- these examples are either commutative or cocommutative (?), but we can "deform" or "quantize" these to get a special type of hopf algebra called a quantum group. no agreed-on formal definition yet, but the idea is that a standard algebraic group is well described by its standard hopf algebra of regular functions, while the deformed hopf algebra describes non-standard / quantized algebraic group (so not an algebraic group). instead of trying to manipulate these weird group-like things, work with their hopf algebras instead.
- category of representations = ribbon category (morphisms are isotopy classes of framed links, ie tangles) = braided monoidal category (Vect-enriched)
- braiding is a matrix (the R-matrix)
- $R:V \otimes W \to W \otimes V$ satisfies some axioms
- add trace at each pt
- knot invariant
- put the picture of gru (step 5: profit??)
- quantum SU2 gives rise to the Jones polynomial
Q: What does the set of "maximally measurable" functions look like?
- Given a $\sigma$-algebra $S$ on $\mathbb{R}$, define a maximally measurable function $f:\mathbb{R}\to \mathbb{R}$ as one such that $\sigma(f) = S$, where the codomain is given the Borel $\sigma$-algebra.
- Find a "good" characterization of the maximally measurable functions of a given $S$
- Note that this is a subset of $S$-measurable functions (by Doob-Dynkin lemma), and we want the preimages of open sets to generate $S$
- Examples
- trivial sigma algebra: the generators are the same as the measurable functions, all constant functions
- finite sigma algebras: functions are constant on each 'piece' but the generators cannot take the same value on any two pieces
- borel sigma alg: most surjective functions work? eg f(x) = x, odd polynomials
- has to like be non locally constant on some part of the domain for every part of the codomain
- do all surjective continuous functions work?
- try to find a surjective counterexample. has to be discontinuous,
- TODO
- lebesgue sigma alg: it's not clear if this is nonempty. borel measurable functions will generate at most the borel sigma alg, so a generator can't be borel measurable. somehow it has to be an indicator of like every single lebesgue set at the same time, perhaps integrate over some infinitesimal versions of them?
- you can easily get countably many indicators fit in
- how many lebesgue and borel measurable sets are there anyway? if there's more lebesgue measurable sets you lose by counting
- borel is apparently continuum
- lebesgue is apparently 2^c, so ok, you lose
LEDs
- light is produced by electrons filling electron holes, this recombination is called electroluminescence
- probably a bit different bc solids have energy bands, not specific orbitals
- semiconductor band gaps can be direct or indirect
- there is a conduction band and valence band. the min energy state in each is characterized by a "crystal momentum" k-vector in the brillouin zone (basically a fundamental domain of the reciprocal lattice)
- indirect if these are two different vectors and direct if they are the same (the crystal momentum of electrons and holes in each of the two bands, that is)
- direct = u can directly emit a photon, indirect = can't do this bc you need an intermediate stage to faciliate the change of momentum
- in the LED case the free e- are in the conduction band and the holes are in the lower energy valence band
- at the p-n junction, recombining electrons release their energy as heat (silicon, germanium, where phonons must carry off the change in crystal momentum) or as light (gallium arsenide phosphide = GaAsP, gallium phosphide = GaP)
- if the GaAsP or GaP is translucent then the light escapes and we have an LED. Original LEDs were just GaAs (infrared).
- conflicting info - the next paragraph says Si and Ge are indirect band gap and so have non-radiative transitions, while LED materials are direct band gap. perhaps "direct" means the crystal momentum vectors have the same direction but not necessarily the same magnitude? either way, the size of the band gap sets the wavelength of the emitted light (quick computation after this list)
- typically, you deposit a p-type layer on an n-type substrate, though the opposite is seen. commercial LEDs, especially the GaN and InGaN seen in blue LEDs, use a sapphire substrate.
- problem!! uncoated semiconductors have a high refractive index, which can cause total internal reflection (like an internal mirror), so you lose energy to heat. solved by using a more complicated semiconductor shape to minimize bad angles
- efficiency droop = luminous efficacy decreases as current increases. surprisingly, this is less potent at high temps (though heat does kill lifetime). in 2007 the mechanism was identified as Auger recombination = a rare three-particle process where the energy released by filling a hole is handed to a third carrier, ejecting it, instead of coming out as a photon. solution is to use multiple LEDs in one bulb, each running at lower current.
- new work is going into quantum dot LEDs, which are small enough to take advantage of QM effects. they emit a nice tight gaussian distribution of wavelengths, allowing more accurate color displays (like 97% of the rec 2020 color gamut)
- OLED = organic LED, same principle with organic semiconductors (pi bonded molecules with CHONS atoms, appropriately doped)
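since the band gap sets the photon energy, the wavelength is just $\lambda = hc/E_g$; a quick back-of-envelope check (the band gap values are rough textbook numbers I'm assuming, not from the source):

```python
H = 6.626e-34   # Planck constant, J s
C = 2.998e8     # speed of light, m/s
EV = 1.602e-19  # joules per electron volt

def wavelength_nm(gap_ev):
    # photon wavelength if all the band gap energy goes into the photon
    return H * C / (gap_ev * EV) * 1e9

for name, gap in [("GaAsP", 1.9), ("GaP", 2.26), ("Si", 1.12)]:
    print(name, round(wavelength_nm(gap)), "nm")
# GaAsP ~653 nm (red), GaP ~549 nm (green), Si ~1108 nm (infrared,
# and indirect anyway, so mostly heat)
```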
moduli spaces
definition by example
think about the grassmannian: we have an object and a bundle over that object (each point of Gr represents a subspace, so the bundle, the tautological one, is the disjoint union of all the subspaces)
moduli space M is the base space for a universal family U
definition 1
- (U, M) is universal if any family of schemes T over any base space B is the pullback of U along a unique map B⟶M
- this helps me understand pullbacks! a pullback is completing the analogy in ___ : X :: Y : Z (the pullback over X⟶Z and Y⟶Z)
- moduli space sits under universal family in a way that represents (not category-theoretically) all other bundles
- question: is this T⟶B bundle thing representing "all" bundles or all bundles of the same "type" as U⟶M? like for the grassmannian is it all vector bundles? all VBs of fixed dim (ie a subspace of Gr)? or some broader class of scheme maps
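a minimal picture of definition 1, just restating the above: every family $T \to B$ is the pullback of $U \to M$ along a unique classifying map $B \to M$,

$$
\begin{array}{ccc}
T & \longrightarrow & U \\
\downarrow & & \downarrow \\
B & \xrightarrow{\exists!} & M
\end{array}
$$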
definition 2
- we are given some arbitrary functor F from schemes to sets, which we interpret as sending a base B to the set of all possible families over it.
- intuitively, we want to find the best such guy: a base M that gets sent to a universal family U. this should mean that (U, M) "factors" every other pair (FB, B) in some sense?
- the definition given is that Hom(–,M) is naturally isomorphic to F, and this is the definition of M representing F.
- they also say the family U ∈ FM is the one corresponding to id_M in Hom(M,M) under the natural isomorphism
- I need to review this in the context of Yoneda I suppose
todo / qs
- try to work out the second definition for the grassmannian
- why is Hom(–,GrV) naturally isomorphic to a functor sending B to families with base B?
some finite group theory
derived series = the descending series you get by taking commutators over and over: $G_{i+1}= [G_{i},G_{i}]$, $G_{0}=G$.
- remarkably, for solvable groups (those admitting a subnormal series with abelian quotients) this gives an intrinsic way to produce such a series: the derived series is the fastest-descending one.
- for non-solvable groups it doesn't terminate in the identity (computed example after this list)
- whatever it stabilizes on (for infinite groups we may need to continue transfinitely to get stabilization) is called the perfect core or perfect radical of the group, its largest perfect ($[G,G]=G$) subgroup
- this is a big generalization of: the commutator subgroup is normal, hence nonabelian simple groups are perfect
- a group whose perfect core is trivial is called hypoabelian. the quotient of $G$ by its perfect core is the hypoabelianization of $G$
- every free group and every residually solvable group is hypoabelian
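computed example using sympy's permutation groups (the `derived_series` method is sympy's; nothing here beyond iterating $G_{i+1}=[G_i,G_i]$):

```python
from sympy.combinatorics.named_groups import SymmetricGroup

# S4: derived series has orders 24 -> 12 -> 4 -> 1, ending at the identity,
# so S4 is solvable
print([H.order() for H in SymmetricGroup(4).derived_series()])

# S5: orders 120 -> 60, then the series stabilizes at A5 (the perfect core)
# and never reaches the identity, so S5 is not solvable
print([H.order() for H in SymmetricGroup(5).derived_series()])
```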
need to read proof of jordan holder
some properties of groups
- let $X$ be some property like finite, solvable, nilpotent, p-group, abelian, or cyclic
- residual
- a group $G$ is residually $X$ if it "can be recovered from groups with property $X$"
- for every $g \neq e$ there is a hom from $G$ to an $X$-group sending $g$ to a non-identity element (toy check for finite $X$ after this list)
- categorically, $G$ is residually $X$ iff it embeds into its pro-$X$ completion = the inverse limit over all morphisms $G \to H$ with $H$ an $X$-group
- virtual
- a group $G$ is virtually $X$ if it has an $X$-subgroup of finite index
- pro
- a group $G$ is pro-$X$ if it is an inverse limit of an inverse system of $X$-groups
- cyclic -> abelian -> nilpotent (Q8) -> solvable (S3) -> ...; finite p-groups (like Q8) sit below nilpotent, though abelian doesn't imply p-group
- does this ordering get preserved under virtual / pro / residual? reversed? does anything collapse?
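toy check of "residually finite" for $\mathbb{Z}$ (a finite quotient exists for each $g$: reduce mod $n = |g|+1$):

```python
# Z is residually finite: each nonzero g maps to a non-identity element
# of the finite quotient Z/n when n = |g| + 1
for g in [x for x in range(-50, 51) if x != 0]:
    n = abs(g) + 1
    assert g % n != 0, (g, n)  # image of g in Z/n is nonzero
print("ok")
```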
more on perfect groups
- ore's conjecture
- we know that nonabelian simple groups satisfy $[G,G]=G$. $A_5$ satisfies the stronger property that every element in it is a commutator, not just a product of commutators
- ore conjectured this for all finite simple groups; the only known proof (2008) relies on the classification (brute-force check for $A_5$ below)
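$A_5$ is small enough to brute-force the stronger property; a quick sketch with sympy, using the convention $[a,b]=a^{-1}b^{-1}ab$ (any convention gives the same set):

```python
from sympy.combinatorics.named_groups import AlternatingGroup

G = AlternatingGroup(5)
elems = list(G.generate())  # all 60 elements
# set of all commutators a^-1 b^-1 a b
comms = {(~a) * (~b) * a * b for a in elems for b in elems}
print(len(elems), len(comms))  # 60 60: every element is a commutator
```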
- grün's lemma: if $G$ is perfect, then the center of $G/Z(G)$ is trivial
other types of subgroup series
- recall a subnormal series has each term normal in the next, a normal series has each normal in the whole group, and a composition series is a maximal subnormal series
- chief series
- maximal normal (not subnormal!) series = a composition series under the action of inner automorphisms
- always exists (for finite groups), and jordan-hölder extends to give the same uniqueness of factors
- doesn't seem so useful?
- upper central series
- lower central series
- derived series
- lower / upper fitting series
- idea is to measure how non-nilpotent a solvable group is
- shortest normal series with nilpotent quotients, pretty straightforward
- eg 1 < A3 < S3; 1 < K4 < A4 < S4 (normal subgroup of Sn is a union of conjugacy classes, so the Z2 generated by a transposition isn't normal; the S4 example even has the non-cyclic quotient K4)
- upper series is defined by repeatedly quotienting by the maximal normal nilpotent subgroup (the "fitting subgroup")
- lower series is defined as repeatedly taking "nilpotent residual" $\gamma_\infty(G)$ = intersection of the lower central series
- similar to the derived series, the upper/lower fitting series consist of characteristic subgroups
- upper / lower p-series
- $O_p(G)$ is the p-core = largest normal p-subgroup = normal core of any Sylow p-subgroup (equivalently, the intersection of all of them).
- $O_p'(G)$ is the p'-core = largest normal p'-subgroup (order coprime to p). the "core" $O(G)$ is the 2'-core.
- The p',p-core $O_{p',p}(G)$ is the preimage of the p-core of $G/O_{p'}(G)$ = largest normal p-nilpotent subgroup (that's right, p-nilpotent and p-solvable are things; same as the usual definitions but using the p-series instead of abelian composition series)
- use the p'-core and p',p-core to begin upper series; lower series is defined using focal subgroup theorem stuff (todo?)
- apparently important in modular representation theory: p-core is intersection of the kernels of the irreducible representations over any field of char p, and p'-core has an interpretation in terms of complex irreducible reps.
some probability theory
stopping times
- you have a filtered probability space (filtration $\mathcal{F}_t$ on the sigma algebra)
- then we want a random variable $X$ (measurable function on events) taking values in the time index set of the filtration (eg $\mathbb{N}\cup \{+\infty\}$) so that $\{X\leq t\}$ is measurable with respect to $\mathcal{F}_t$.
- answers question, given these events, what time do we stop at?
- sometimes (often?) there is a requirement that we put zero measure on $\{X=+\infty\}$.
- alternatively, we can formulate the question, given this sequence of events, what time do we stop at?
- then we get a sequence of random variables $(X_t)$ taking values in $\{0,1\}$ (stop or continue), each of which is $\mathcal{F}_t$-measurable
- more generally, an adapted process is a sequence of random variables (stochastic process) each of which is measurable with respect to the corresponding thing in the filtration
- stronger version is a progressive process, for which we have time measurable and not just event space: $(s,\omega)\mapsto X_s(\omega)$ should be $\mathrm{Borel}([0,t]) \otimes \mathcal{F}_t$-measurable for $s \in [0,t]$ for all $t$.
- example
- "play 5 games" is $X(\omega)=5$ in the first formulation and $X_5(\omega)=1, X_{\neq 5}(\omega)=0$ in the second. constant function is always measurable so filtration doesn't matter
- "play until run out of money or 500 games" is stopping rule
- "play until you get the most money you will ever get" is not stopping rule, not measurable wrt filtration
- "play until you double your money" usually has positive probability on $\infty$ so depends on definition
- min, max, and sum of stopping times is stopping time
- all stopping times are hitting times. from the second definition you are just hitting $\{1\}$?
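a tiny simulation of the gambling examples; the walk and the parameters are made up, the point is just which rules look at the future:

```python
import random

# +-1 gambler's walk. "stop when broke or at game 500" only needs the past,
# so it is a stopping time; "stop at the all-time max" needs the whole path.
def run(start=10, max_games=500, seed=0):
    rng = random.Random(seed)
    money, path = start, [start]
    while len(path) <= max_games and money > 0:
        money += rng.choice([-1, 1])
        path.append(money)
    return path

path = run()
stop = len(path) - 1          # decidable from F_t at time t: a stopping time
peak = path.index(max(path))  # needs the future: not a stopping time
print("stopping rule fired at t =", stop, "with money", path[-1])
print("all-time max was at t =", peak, "with money", path[peak])
```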
hitting time
- first hit time, for a stochastic process $X_t$ with filtration on $T$, and for a set $A$, is the random variable $\tau_{A}(\omega)=\inf\{t\in T \mid X_{t}(\omega) \in A\}$.
- first exit time is first hit time for complement of $A$, also denoted $\tau_{A}$ unfortunately
- first return time is first hit time for some fixed singleton set like the origin
- Debut theorem: every hitting time (assumptions: measurable set, progressively measurable process, right continuous complete filtration, universally complete underlying probability space) is a stopping time. proof is complicated, uses analytic sets.
càdlàg function = RCLL = corlol
- continue à droite, limite à gauche = right continuous with left limits = continuous on (the) right, limit on (the) left
- collection of all cadlag functions on a space = skorokhod space
- important for stochastic processes?