I don't see any confusion at all. The first few paragraphs of the link say it's based on PyMC, which itself appears in your link under "Existing probabilistic programming systems". So it's a book that's a practical guide to using one of the systems you reference.
As a researcher with MCMC interests, I agree with the grandparent post. Used as technical jargon, "probabilistic programming" tends to mean: you specify a model using a programming language, a compiler then works out how to do inference in that model and writes the inference code for you.
PyMC is a toolkit that makes it easier to write inference code for a wide range of models, but isn't as automatic as the field of probabilistic programming promises.
As the linked site itself says, it lists more than probabilistic programming systems, and PyMC falls into the latter categories it lists:
> Below we have compiled a list of probabilistic programming systems including languages, implementations/compilers, as well as software libraries for constructing probabilistic models and toolkits for building probabilistic inference algorithms.
PyMC follows the library approach to probabilistic programming rather than inventing yet another application specific language that only a niche of developers will be willing to spend time to learn.
Despite not introducing a new language syntax or DSL, PyMC is still probabilistic programming in the sense that you have Python variables that represent random variables with prior distributions, and you use those to derive new random variables via deterministic Python expressions or functions. Finally, you plug in an inference engine to invert the execution order and derive a posterior distribution over the unobserved variables.
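To make the "invert the execution order" point concrete, here is a toy pure-Python sketch (not PyMC's actual API): a prior and a generative model written as ordinary functions, with the posterior recovered by rejection sampling against observed data.

```python
import random

random.seed(0)

def prior():
    # prior over the unobserved coin bias: uniform on [0, 1]
    return random.random()

def simulate(p, n_flips):
    # generative model: count heads in n_flips flips of a coin with bias p
    return sum(random.random() < p for _ in range(n_flips))

def posterior_samples(observed_heads, n_flips, n_samples=2000):
    # "invert" the generative program by rejection sampling:
    # keep only prior draws whose simulated data matches the observation
    samples = []
    while len(samples) < n_samples:
        p = prior()
        if simulate(p, n_flips) == observed_heads:
            samples.append(p)
    return samples

post = posterior_samples(observed_heads=8, n_flips=10)
estimate = sum(post) / len(post)  # posterior mean, near (8+1)/(10+2) = 0.75
```

Rejection sampling is hopelessly inefficient beyond toy problems, which is exactly why libraries like PyMC ship real inference engines (MCMC, variational methods) behind the same forward-model description.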
I tend to agree with Ian that it's confusing to conflate probabilistic programming and libraries that support Bayesian inference. In PyMC variables represent random variables, but these variables can't be used with Python constructs like conditionals and loops. Python is used to construct a DAG, which is then executed.
I think a better definition of probabilistic programming languages is: languages where you can replace any variable of type T with Random<T>. The line isn't entirely clear, but library approaches in languages like Python don't fit, since they can't handle control flow. BUGS/JAGS/Stan might qualify, although they are very limited declarative languages; their motivation is primarily a compact modeling syntax, not a real programming language.
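The control-flow point is the crux, and a one-function sketch shows it (an illustration, not anyone's library API): a geometric distribution written as an ordinary loop, where the number of iterations is itself random.

```python
import random

random.seed(0)

def geometric(p):
    # number of failures before the first success: the loop runs a
    # random number of times, so there is no fixed-size graph a
    # DAG-building library could construct ahead of execution
    n = 0
    while random.random() > p:
        n += 1
    return n

draws = [geometric(0.5) for _ in range(10_000)]
mean = sum(draws) / len(draws)  # expectation is (1 - p) / p = 1.0 for p = 0.5
```

A graph-based library has to special-case this (or forbid it), whereas a language where any T can become Random<T> handles it for free, since branching and looping on random values is just ordinary evaluation.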
There's no need for probabilistic programming languages to be some esoteric DSL. You can turn a language like Python or Matlab into a probabilistic programming language with a lightweight compiler transformation: http://www.mit.edu/~wingated/papers/lightweight_pp.pdf. Actually doing inference efficiently, however, remains as challenging as ever.
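The core trick in that lightweight approach can be sketched in a few lines (this is only the addressing idea, not the paper's implementation): give every call to a random primitive a unique address, and re-run the program against a trace that pins some of those choices, so an inference algorithm can replay the program while perturbing one choice at a time.

```python
import random

def run(program, trace):
    # interpret `program` under `trace`: reuse any recorded random
    # choices, sample (and record) fresh ones at unseen addresses
    counter = [0]
    def flip(p=0.5):
        addr = counter[0]   # address = position in the call sequence
        counter[0] += 1
        if addr not in trace:
            trace[addr] = random.random() < p
        return trace[addr]
    return program(flip)

def program(flip):
    # ordinary Python with control flow, written against `flip`
    if flip():
        return int(flip()) + int(flip())
    return 0

# deterministic replay: a full trace pins every choice the program makes
trace = {0: True, 1: True, 2: False}
result = run(program, trace)  # True branch, then 1 + 0, so result is 1
```

Single-site Metropolis-Hastings then amounts to copying the trace, flipping one recorded choice, re-running the program, and accepting or rejecting the new trace; the hard part the paper leaves open is making such generic inference efficient.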
>it's confusing to conflate probabilistic programming and libraries that support Bayesian inference
But it's a generic term, so you could say the same about functional programming or logic programming, both of which can be done in Python even if there are more advanced or integrated systems elsewhere. I don't think most people, besides perhaps PL researchers, care at which layer of the stack things happen or get optimised; if you are using the relevant mathematics and statistics, that's what you are doing. People are playing semantics in insisting the term means only one thing when it's obviously used both in a general way and sometimes in a specific way.
The bottom line is that the guy who wrote the book thinks it's probabilistic programming, ogrisel does, I do, and the people who run http://probabilistic-programming.org/wiki/Home seem to refer to it as probabilistic programming as well. I don't buy Ian's argument that it belongs to some latter category on the site: PyMC is directly linked in a section titled "Existing probabilistic programming systems". They use "as well as" to join the two groups, so either the first group is the "systems" and the rest are still "probabilistic programming" just without "systems", or they are all "probabilistic programming systems" if that's how "as well as" is operating.

The arguments against this split hairs and play semantics far too much, when n-grams regularly have more than one meaning. Indeed, it's amusing to see probabilistic people arguing for a single interpretation rather than saying there could be more than one depending on context (an NLP program trying to disambiguate a given n-gram would look at the other words present, topic models for the document, et cetera).