Some informal notes on forcing in NF and NFU
1. The forcing construction in NFU
Let <= be an s.c. partial order with domain P (and let >= be its
converse). Since we need infinite s.c. partial orders, we need to
assume AxCount; this seems harmless, since AxCount is obviously true
:-)
For p,q in P, I write p <= q when q gives more information than p. This
may be the reverse of the standard convention.
I understand forcing constructions by thinking about Kripke models of
intuitionistic logic. There is a difference in the treatment of names
of objects: all names that we will ever use are present at the top
level, and the question as to whether two names are equal may be settled
differently in different "possible worlds" (once settled, it remains
settled, of course). Also, of course, we need our logic to be classical,
so we take care that our connective definitions and the semantics of
our atomic sentences work correctly. All of this is true of standard
forcing techniques; there is no novelty about any of this!
It is possible to do forcing in NFU by simulating the standard forcing
constructions in the interpretation of (a fragment of) Zermelo set theory
in the isomorphism classes of well-founded extensional relations. The same
simulation can be carried out in NF, but creates a lot of urelements.
The development here takes a different tack, though for NFU the results
one can prove are not different. The advantage for NFU is philosophical;
the construction is simple and natural in the NFU context. I defined this
construction while I was on sabbatical in Cambridge; it is modelled on
Marcel's weak extensional collapse of (NF - extensionality) to NFU.
We define a special class of subsets of P, which we can think of as
"classical truth values". RO(P) = the set of all subsets A of P
such that
i. for any p in A and q >= p, q is in A.
ii. If p in P has the property that for every q >= p, there is
r >= q such that r is in A, then p is in A (this ensures that
we have a "classical" truth value).
We interpret all sets in our original model of NFU as functions from V
to RO(P), by defining A[x] as the unique a in RO(P) such that (x,a) is
in A, or as the empty set (which is an element of RO(P)) if there is
no such unique a. Note that the relative type of A is one higher than
the relative type of x in A[x]. (It is important to note here that I
use a type-level pair (definable in NF, provably existing in
NFU + Infinity + Choice), so the type of a function is one higher
than its arguments and values, instead of three higher).
We can now interpret every element of our universe as a "name"; if p
is an element of A[x], we would like to say that p forces "x is in A".
But we can't, for reasons we will see shortly.
We begin by defining equality of names. This is defined exactly in
line with the intuition above: p in P forces "A = B" (we write this
p |- A = B) just in case for all x, p is in A[x] if and only if p is in B[x].
Notice that the types of A and B are the same in this definition, and
the type of p is two types lower (but is "adjustable" because P is s.c.).
Now we see the problem. The identity criteria for "elements" x of A
under the condition p (objects x such that p is in A[x]) are the
identity criteria of the original model of NFU, not the new identity
criteria of the forcing interpretation. The obvious way to change
this (which is used in standard set theory) would involve induction on
membership, which is impossible here. What we do instead, following
Marcel's idea of "weak extensional collapse", is treat "names" which
do not respect the identity criteria of the forcing interpretation as
urelements.
We say that p |- set(A) (p forces "A is a set") if and only if for
every q >= p, x such that q is in A[x] and y such that q |- x = y,
it is also the case that q is in A[y].
This says that if x is intuitively an "element" of A under condition q
at least as strong as p, and q forces x=y, then y is also intuitively an
"element" of A under condition q.
This definition is only stratified because P is s.c. In q \in A[x], q
is one type lower than x, while in q |- x=y, q is two types lower than
x. But the fact that P is s.c. allows the type of q to be adjusted
freely.
Now we can define our official membership relation:
(p |- x is an element of A) iff p is an element of A[x] and p |- set(A).
Notice that x is one type lower than A, just as in "x is an element of
A".
Additional details:
define p |- x & y as p |- x and p |- y
define p |- (Ax . phi) as (Ay. p |- phi[y/x])
(this is different from the usual definition in Kripke
models because all names are present in all worlds).
define p |- ~phi as (Aq >= p. ~(q |- phi))
Then define all other connectives using their definitions from these
connectives in _classical_ logic.
Notice that this is only defined for concrete sentences (with instantiations
of variables by arbitrary names); |- is certainly not a set!
Everything has classical semantics (is equivalent to its double
negation) because atomic sentences have classical semantics (due to
the role of RO(P) in the definitions) and we use only classical
connectives and quantifiers.
Equality has the right logical properties because all atomic sentences
satisfy the correct logical properties for equality; the rest follows
by induction on the complexity of sentences.
For standard reasons, if p |- phi and psi is deducible in classical
logic from phi, then p |- psi.
One comment: in the absence of choice (as in NF), there may be no name
which witnesses an existential statement true at a condition p --
however, in such a case there will be a dense set of conditions
stronger than p at which there is a name witnessing the statement.
This appears to be harmless. It does not affect the validity of
comprehension axioms, nor does it affect the validity of classical logic
in the forcing interpretation.
Another comment: The "truth values" of unstratified statements
may be proper subclasses of P. That is, for some unstratified sentence
phi, the class of all p such that p |- phi may not be a set. The Axiom
of Small Ordinals would preclude this, but the phenomenon does not seem
to be harmful. It does mean that I feel freer to work with P instead
of the Boolean algebra RO(P), as is mathematically more convenient
in forcing constructions. Stratified sentences will have "truth values"
in RO(P).
Weak extensionality holds in the forcing interpretation; this should
be clear from the definitions. To see that comprehension holds, let
phi be stratified. A name for the set of x such that phi is the
set of all pairs (a,A) such that A is the set of all conditions p
such that p |- phi[a/x]. A exists because the stratification conditions
for p |- phi[a/x] are the same as those for phi; it belongs to RO(P)
by considerations outlined above.
I've worked my way through the standard proof of the independence of
CH, and it all goes as usual. Of course, the construction of
analogies between ordinals and cardinals in the underlying model and
ordinals and cardinals in the forcing interpretation and the statement
of assertions such as "ccc forcing relations preserve cardinals" based
on these analogies look a little different (but not much different).
I got this far in Cambridge, though I looked closely at the details
of the standard forcing proof for CH only recently.
Forcing in NF
This forcing construction, if applied in NF, will create urelements
(there will be names which do not respect equality). The big surprise
is that on close examination it turns out that the model of NFU
obtained after forcing on P has no more urelements than sets, and so
can be converted back into a model of NF (using "permutation
techniques" -- though the map used is not really a permutation! The
theorem, due to Boffa, is that if one has a bijection f from V onto
P(V) in a model of NFU one can redefine membership: x E_new y is
defined as x E f(y), and this will give a model of NF). Statements
whose independence is proved by forcing arguments are invariant under
"permutation techniques" (even when generalized), so we have forcing
for NF as well, which is a breakthrough!
The old approach to forcing could not do this -- it was clear that
there were many more urelements than sets in models built by forcing
on isomorphism types of well-founded extensional relations in NFU or
NF.
It is sufficient to show that there is an injection from the universe
into the set of sets in the forcing interpretation in NF. The rest follows
by Schroder-Bernstein.
The construction is best explained by returning to the idea that under
condition p, a name A intuitively has as its "elements" the objects x
such that p is an element of A[x]. We had to abandon this intuition because
it didn't respect equality (for some names A, p might force x = y
and we might have p in A[x] but not in A[y]).
We can recover this by exhibiting objects char(x) such that for all
conditions p, p |- char(x) = char(y) just in case x = y in the real
world; we then (speaking intuitively again) replace the "elements" of
each name A with their images under char to obtain a name of a set
uniquely determined by the general name A.
char(x) is defined as the function which takes all elements of x to P
(which is an element of RO(P)) and all elements of y to the empty set
(which is also an element of RO(P)). (Note that char(x) is at the
same relative type as x; this depends on the use of a type-level
ordered pair.)
The condition on char given above is easily seen to hold. It does not
hold in NFU: p |- char(u) = char(v) for any urelements u and v.
Now the idea is to turn an arbitrary name A into a set by replacing all
of its intuitive "elements" under all conditions with their images under
char, which will eliminate any conflicts with the notion of equality
in the forcing interpretation, because equality between char(x) and char(y)
under any condition coincides with equality between char(x) and char(y)
in the real world.
We define A' as the set of all (char(x),a) such that A[x] = a.
A' is not a set: some objects y which are not images under char
are equal to some char(x) under some conditions, and these need to be
added:
for any name B define B* as the function taking each x to the set of
all conditions p such that for each condition q>=p, there is a
condition r>=q such that there is a y such that r |- x=y and r belongs
to B[y].
This is a general operation for fattening a name so that it respects
the equality of the forcing interpretation.
We take (A')* as the image of A under the injection we are trying to
construct. It is clear that (A')* is a set. We need to establish that
this is really a function (respects equality) and that it is an injection.
It is easy to see that p |- "A=B iff A'=B'" because A' is constructed from
A by replacing the "elements" of A with objects with precisely parallel
identity conditions. To see that p |- "A'=B' iff (A')* = (B')*",
observe that nothing is an element of (A')* under a condition p without
being equal to a char(x) under that condition, and identity conditions
on char(x)'s are the same under all conditions: it is safe to ignore
the stuff added in the fattening process.
If one forces on NF + AxCount with the extension (not inclusion)
partial order on well-orderings of proper subsets of the reals, one
obtains a model of NFU + AxCount + "the continuum can be well-ordered"
(the details require checking, but this is not hard -- POSSIBLE GAP --
I do _not_ see that this works; if aleph(c) has countable cofinality
things might not work!). Then a permutation can be used to eliminate
the urelements (because there are no more urelements than sets) and
the process of constructing the permutation model will not affect the
truth of the invariant sentence "the continuum can be well-ordered".
This is a brand-new result which no previously available techniques
could touch.
More comments:
The forcing technique in NFU is _easier_ than the standard techniques
in ZFC (if one understands how to work in NFU, which is a big "if"!),
because it is not necessary to iterate the construction of names. At
least, that's my immediate perception. Quite the reverse was true of
the old technique with the detour through equivalence classes of
well-founded extensional relations: it was far nastier than the
technique in ZFC and obviously parasitic on it.
I have developed a "native" technique for symmetry arguments in NFU
(analogous to proofs of the independence of choice from ZFA). This is
much more technical, and I haven't tried putting it together with the
forcing.
Is there a nice way to formulate forcing in NF so that no urelements are
ever created?
This technique is "boolean valued"; I don't make any effort to
collapse the space of conditions or "possible worlds". In NFU +
Choice, one could use an ultrafilter to collapse the "multi-valued"
model to a 2-valued model; I don't see that this is really all that
helpful, as such models are very likely to be pathological in some way.
In NF, we can show consistency of enough choice to allow collapse of
any given forcing interpretation to a 2-valued interpretation. I
don't see that the machinery of countable transitive models and
generic ultrafilters is natural to use in the NF context: it isn't
nearly as natural to think of models of NFU inside the world of NFU as
it is to think of models of ZFC inside the world of ZFC.
It would be nice to have separation for s.c. sets. I asked Solovay
how strong NFU + AxCount + "every subclass of an s.c. set is a set"
might be. This would ensure that the "truth values" of all sentences
were represented by sets. It also gives such things as full math
induction. Is it as strong as ZFC - power set + weakly compact? Of
course, the Axiom of Small Ordinals works for this, but it would be
interesting if s.c. separation were weaker.
A possible problem:
This has nothing to do with the forcing technology -- I'm not sure how
to force the well-ordering of the reals in the absence of choice. I have
heard rumors that this has been done, but I haven't got the details right.