[rdfweb-dev] partly anonymous web communities

Sun Jul 6 13:39:07 UTC 2003

> For me, the difference is that if I use the real mailbox to
> generate the hash, I can verify that a given profile actually is
> the person I think it is by information he/she has given. You're right,
> I can only identify people if I know the possible mailboxes beforehand, but
> then I can be sure about the identity. I think this conflicts with the
> notion that the community is by default anonymous.

I'm not sure we're disagreeing here.  A foaf:Person without a cleartext e-mail
address, using a hashed one, is useful only to someone who already knows their
e-mail address.  Thus if a 'new person' wants to see if people they know are
'already in foafspace' they can hash up the address and search for it as a key.
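
As a rough sketch of that lookup: foaf:mbox_sha1sum is the SHA-1 hex digest of
the full mailto: URI, so anyone who already has the address can recompute the
key.  The index, the addresses and the profile URL below are made up for
illustration; a real tool would build the index from harvested foaf files.

```python
import hashlib

def mbox_sha1sum(address):
    """SHA-1 hex digest of the mailto: URI, as used by foaf:mbox_sha1sum."""
    uri = "mailto:" + address
    return hashlib.sha1(uri.encode("utf-8")).hexdigest()

# Hypothetical index: published hash -> URL of the foaf file that carries it.
known_profiles = {
    mbox_sha1sum("alice@example.org"): "http://example.org/alice/foaf.rdf",
}

def find_profile(address):
    """Look up a profile by hashing an address the searcher already knows."""
    return known_profiles.get(mbox_sha1sum(address))
```

Someone who doesn't already know the address has nothing to hash, which is the
whole point.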

> Of course, with FOAF anybody can put the information that two mailboxes
> (real or pseudo) belong to the same person into foafspace, thus also
> destroying the anonymity.

Sure, if someone naively exposes the same cleartext addresses within the
context of the other user's foaf URI then yes, they're making the social faux
pas of outing the other person's address.

> The crucial difference for me, is that this
> is just some dude claiming the relation (anyone can claim anything in
> FOAF) vs the first version (with the real mailbox as generator for the
> hash) would be a verifiable claim; which, of course, is usually a good
> thing, except here.

I don't think there's any reliable way to "prevent" someone from asserting
relationship information.  This is true in the real world and it's certainly
going to be just as much of a problem in foafspace.  The way the real world
says "that guy doesn't know me" is just as applicable here.  Thus my past
questions about *proving* foaf provenance.

> Or stated otherwise: In this way, introducing FOAF wouldn't change the
> anonymity of the community, as it was already possible that someone
> puts up a webpage and says "hey that person in this community is
> actually the same as that one here". But this is just an unverifiable
> claim, and as long as I don't use the real mailbox, nothing else is
> possible in FOAF, too.

Indeed.

> Well, adding a relaying service would be no problem, but IMHO this
> wouldn't fix the problem in its original spirit. I mean, nobody will
> actually use this relaying service for real if I wouldn't communicate
> it enough (which in many communities doesn't make much sense). So while
> the basis for the hash would then be something that can actually get
> email, it wouldn't be a really used email address. The primary way to
> get in touch with the user is given through foaf:homepage.

Sure, if people know to use it.  It's the same argument as for the hashed mail
address.  They have to know to search on that key to find the person.  The
question is how good existing foafspace tools are at dealing with searches by
info other than the e-mail address (as a hash).

> This sounds interesting. Let me see, if I get this right: Your goal is
> to preserve the discovery mechanisms in case of incomplete information
> about the person you are looking for. And using a URI scheme instead of
> mailboxes looks cleaner and more widely adopted. Right?

Incomplete only in the sense that it shows a name without an e-mail address.
This name could well be a pseudonym.  Since your community doesn't offer a
native e-mail address and the 'risk' of doing it via a hash of their private
address is MUCH too high, this is another way to do it.

The value of hashing the persona URI is that it hides even which community the
persona belongs to!  If a third party came across this pseudonymous foaf:Person
and saw that it had a persona URI hash, they wouldn't be able to tell what
community it represented.  Only users of the community, who would 'know' to
hash the persona URI and then search for it, could make the association.  Now,
an outside party, on "discovering" the foaf persona URI structure, could
certainly make a hash of it themselves and search foafspace for it.  This is
where the 'hide in plain sight' aspect shows: a persona URI hash is not
completely private.  Were someone to accidentally put their hashed private
persona URI in their public foaf it would certainly be possible to
cross-reference them.  Thus the critical importance of helping users avoid
that mistake.
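
To make the cross-referencing risk concrete, here is a small sketch.  The
persona URI scheme, the nicknames and the file names are all invented for the
example; nothing here is an existing FOAF property or a real community's URL
layout.

```python
import hashlib

# Hypothetical persona URI scheme for the community (an assumption).
PERSONA_BASE = "http://community.example/persona/"

def persona_hash(nickname):
    """SHA-1 hex digest of the (made-up) persona URI for a nickname."""
    return hashlib.sha1((PERSONA_BASE + nickname).encode("utf-8")).hexdigest()

# (foaf file, persona hash found in it); the second entry is the mistake:
# the same hash leaked into the user's public, real-name profile.
published = [
    ("pseudonymous-profile.rdf", persona_hash("nightowl")),
    ("public-profile.rdf", persona_hash("nightowl")),
    ("other-profile.rdf", persona_hash("daybird")),
]

def documents_sharing(hash_value):
    """Every file carrying this hash, i.e. what a searcher could tie together."""
    return sorted(doc for doc, h in published if h == hash_value)
```

Anyone who learns the URI scheme can call persona_hash() themselves, so the
hash only hides the community from people who don't know where to look.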

This is little different than not putting a sign up on your house that shouts
out something about you that's not for public consumption.  As in, don't be
stupid.  It's also a near parallel to the 'plain brown wrapper' that might
contain something you received via postal mail.  Again, it's a secret only if
the people involved, who want to share, agree to not be stupid about it.

> OTOH, mailto:<mailbox> is also a URI. And thus, in some sense the
> mbox_sha1sum property works in this URI-subspace, but is otherwise
> essentially a "hash of a URI that uniquely describes this user". So
> something worthwhile could be to add a general mechanism to FOAF which
> says, "All these URIs are associated with this and only this person",
> where some of those URIs might be given as hash only.

In fact, there's nothing that would prevent you from inventing something else
and hashing it.  The barrier being whether tools will know to do searches on it.

> In fact, it seems like some other properties serve this functionality,
> too. This was pointed out by Morten Frederiksen:

Yep, more than one way to skin this cat.

> So, all of these inverse functional properties (most of which are even
> URIs) seem to fall in this category. That is great news, and as
> foaf:homepage isn't something I would want to obfuscate I can start
> right now with it.
>
> The reason I thought that mbox is the primary way to identify people
> is, that I got the impression, that all the tools like foafnaut seem to
> work that way, but I might be wrong? Will these tools handle my foaf
> files if there is neither a foaf:mbox nor a foaf:mbox_sha1sum property?

That one tool doesn't suit your needs shouldn't prevent you from working around
it.  Some tools started with using unhashed addresses.  Most still support using
an unhashed address as the search query.  There has been effort in some of them
to avoid needlessly exposing addresses to the 'public'.

> From the perspective of a tool writer, who wants to maximize linkage:
> One should check for all inverse functional properties in the used
> namespaces and use them to link. (I'd guess a general inference engine
> would work that way, anyway?) This in turn would make a generalized
> "that hashed URI refers to me" property unnecessary, as usually that
> URI will also have a deeper meaning (like being a mailbox, homepage,
> picture, IM-id, etc.).

I think so but I'm not exactly sure what you're saying here.
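
If I'm reading it right, the idea is roughly the sketch below: treat any two
descriptions that share a value for an inverse functional property as the same
person.  The property list is only a small sample, and the record layout
(plain dicts, one value per property) is a simplification I'm assuming for the
example, not how any particular tool stores things.

```python
from collections import defaultdict

# A few FOAF inverse functional properties (sample only, not exhaustive).
IFPS = {"foaf:mbox", "foaf:mbox_sha1sum", "foaf:homepage"}

def smush(records):
    """Group record indices that share a value for any property in IFPS."""
    parent = list(range(len(records)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    first_seen = {}  # (property, value) -> index of first record carrying it
    for i, rec in enumerate(records):
        for prop, value in rec.items():
            if prop not in IFPS:
                continue  # e.g. foaf:name identifies nobody uniquely
            key = (prop, value)
            if key in first_seen:
                parent[find(i)] = find(first_seen[key])
            else:
                first_seen[key] = i

    groups = defaultdict(list)
    for i in range(len(records)):
        groups[find(i)].append(i)
    return sorted(groups.values())

records = [
    {"foaf:name": "Nightowl", "foaf:homepage": "http://community.example/~nightowl"},
    {"foaf:mbox_sha1sum": "abc123", "foaf:homepage": "http://community.example/~nightowl"},
    {"foaf:mbox_sha1sum": "abc123"},
    {"foaf:name": "Nightowl"},
]
merged = smush(records)
```

Here the first three records end up in one group (linked through the homepage
and then the hash), while the last one stays on its own because a shared name
proves nothing.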

-Bill