An interesting insight into why applying a log transform often makes data look normally distributed: many laws of nature are multiplicative (F=ma, PV=nRT, etc.). If you start with positive i.i.d. random variables and multiply them, you get approximately log-normal data by virtue of the central limit theorem (because multiplication becomes addition on a log scale, and the CLT is also somewhat robust to departures from i.i.d.). Thinking of data as the product of many influential factors, we thus get an approximately log-normal distribution.
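A quick way to see this numerically is a small simulation. This is only a sketch under assumed settings: the factor distribution (uniform on 0.5 to 1.5), the number of factors, and the helper names `simulate_products` and `skewness` are illustrative choices, not anything from the comment above.

```python
import numpy as np

def skewness(x):
    """Sample skewness: third central moment over variance^(3/2)."""
    x = np.asarray(x, dtype=float)
    centered = x - x.mean()
    return float(np.mean(centered**3) / np.mean(centered**2) ** 1.5)

def simulate_products(n_factors=50, n_samples=20000, seed=0):
    """Multiply n_factors positive i.i.d. factors for each sample.

    The uniform(0.5, 1.5) factor distribution is an arbitrary assumption;
    any positive distribution with finite variance of its log would do.
    """
    rng = np.random.default_rng(seed)
    factors = rng.uniform(0.5, 1.5, size=(n_samples, n_factors))
    return factors.prod(axis=1)

products = simulate_products()

# The raw products are heavily right-skewed, as a log-normal would be.
print("skewness of raw products: ", skewness(products))

# log(product) = sum of log(factors), so the CLT applies to the logs
# and their skewness should be close to zero.
print("skewness of log(products):", skewness(np.log(products)))
```

With these settings the raw products should come out strongly right-skewed while their logs look roughly symmetric, which is what a log-normal distribution predicts.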
The CLT does not require i.i.d. (independent, identically distributed) variables. Independence and finite variances are enough, plus a rather weak condition on slightly higher-order moments (Lyapunov's condition, for example). Otherwise the variables can be quite different from each other.
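As a rough check of that claim, here is a sketch that sums independent variables drawn from three different families with different scales. The particular mix (uniform, exponential, scaled coin flips), the scale range, and the function names are assumptions made for illustration only.

```python
import numpy as np

def sum_of_mixed_variables(n_terms=300, n_samples=20000, seed=1):
    """Sum independent, non-identically distributed variables.

    Each term gets its own scale and one of three distribution families.
    No single term dominates the total variance, which is the informal
    content of the extra CLT condition.
    """
    rng = np.random.default_rng(seed)
    scales = rng.uniform(0.5, 2.0, size=n_terms)  # assumed scale range
    total = np.zeros(n_samples)
    for i, scale in enumerate(scales):
        if i % 3 == 0:
            total += rng.uniform(0.0, scale, size=n_samples)
        elif i % 3 == 1:
            total += rng.exponential(scale, size=n_samples)
        else:
            # Fair coin flip scaled to take the values 0 or `scale`.
            total += scale * rng.integers(0, 2, size=n_samples)
    return total

def standardized_moments(x):
    """Return (skewness, excess kurtosis); both are 0 for a normal."""
    z = (x - x.mean()) / x.std()
    return float(np.mean(z**3)), float(np.mean(z**4) - 3.0)

skew, excess_kurtosis = standardized_moments(sum_of_mixed_variables())
print("skewness:", skew)
print("excess kurtosis:", excess_kurtosis)
```

Both numbers should land near zero even though the individual terms are far from identical. A small positive skew is expected to remain, contributed by the exponential terms, and it shrinks as more terms are added.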
No, but it holds more generally. Taking the log of data tends to make it look "more correct" even when the transform isn't theoretically justified, and this can lead to very wrong conclusions.
Matt Parker says it's because that's how humans are naturally inclined to think, and used the midpoint between 1 and 9 to illustrate. We'd say 5, but "children and others not exposed to math would say 3", and then he gave some explanation with beads or coins. It didn't make sense to me, but I do know that if a graph uses a log scale I need to look at it harder to make sure they're not trying to pull a fast one on us here, folks.