URL of this specification:
http://zesty.ca/pfif/1.3
FAQ, examples, and other information on PFIF:
http://zesty.ca/pfif
This document is licensed under the GNU Free Documentation License 1.2.
This document defines the People Finder Interchange Format, which consists of a data model and an XML-based exchange format for sharing data about people who are missing or displaced by natural or human-made disasters. The data model is first described in a manner independent of implementation style (object-oriented, relational, or XML), then the PFIF XML format is specified by a RELAX NG schema. This document also offers an example of a possible relational database schema for PFIF data.
Each PFIF repository may contain original records and clone records. An original record is a record residing in its original repository; a clone record is a copy of a record that originated in another repository. The following diagram describes the life of a PFIF record as it is created and then travels to other repositories.
                        .----------------------.
                        | 1. real-world facts  |
                        '----------------------'
                           |                |
       entered by a human  |                |  entered by a human
   into a PFIF repository  |                |  into a non-PFIF repository
                           |                |
 entry_date, source_date,  |                |
  source_name, source_url  |                |
are set by the repository  |                |
                           v                v
  .------------------------------.  .---------------------------------.
  | 2a. original PFIF record in  |  | 2b. original non-PFIF record in |
  | record's original repository |  | record's original repository    |
  '------------------------------'  '---------------------------------'
                           |                |
       exported as a PFIF  |                |  parsed and converted to the PFIF
         document or feed  |                |  data model by a human or program
                           |                |
                           |                |  source_date, source_name, source_url
                           |                |  are set by the human or program
                           v                v
                          .-----------------.
     .------------------> | 3. PFIF record  |
     |                    '-----------------'
     |                             |
     |                             |  loaded into a PFIF repository
     |                             |
     |                             |  entry_date is set to date/time of import
     |                             v
     |         .--------------------------------------.
     |         | 4. clone record in a PFIF repository |
     |         '--------------------------------------'
     |                             |
     |                             |  exported as a PFIF document or feed
     |                             |
     '-----------------------------'
Whenever a PFIF repository adds a new original record or clone record, it must set the entry_date field to the current time. This time value must never decrease as records are added. A client can incrementally update its copy of a repository by querying for all records with an entry_date greater than or equal to the entry_date of the last received record.
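The incremental update described above can be sketched in Python as follows. This is a minimal illustration, not part of the specification: the in-memory list of dictionaries, the helper parse_pfif_time, and the function records_since are all hypothetical names chosen for the example.

```python
from datetime import datetime, timezone

def parse_pfif_time(value):
    """Parse a PFIF UTC timestamp such as 2010-01-17T02:30:00Z."""
    fmt = "%Y-%m-%dT%H:%M:%S.%fZ" if "." in value else "%Y-%m-%dT%H:%M:%SZ"
    return datetime.strptime(value, fmt).replace(tzinfo=timezone.utc)

def records_since(records, last_entry_date):
    """Return records whose entry_date is >= last_entry_date, oldest first."""
    cutoff = parse_pfif_time(last_entry_date)
    fresh = [r for r in records if parse_pfif_time(r["entry_date"]) >= cutoff]
    return sorted(fresh, key=lambda r: parse_pfif_time(r["entry_date"]))

# Hypothetical repository contents, used only for illustration.
repository = [
    {"person_record_id": "example.org/p1", "entry_date": "2010-01-17T02:30:00Z"},
    {"person_record_id": "example.org/p2", "entry_date": "2010-01-17T02:45:00Z"},
    {"person_record_id": "example.org/p3", "entry_date": "2010-01-17T02:45:00Z"},
]

# The client last received p2, so it asks for everything from p2's entry_date on.
update = records_since(repository, "2010-01-17T02:45:00Z")
```

Because the comparison is inclusive, the client receives the last record again and can discard it by record identifier. The inclusive bound avoids missing other records that share the same entry_date.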
The original repository for a record (2a or 2b in the diagram above) can update any of the fields on a record after it is created, except the person_record_id field. Whenever a PFIF repository creates or updates an original record, it must set both the source_date and entry_date fields to the current time. When a repository imports a PFIF record that has the same record identifier as an existing record, it should keep the version with the latest source_date.
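As a rough sketch of the import rule above, the following Python function keeps whichever version of a record has the latest source_date. The dictionary-based store and the names import_record and parse_pfif_time are assumptions made for this example; keeping the existing copy when the two source_date values are equal is also a choice of this sketch, since the text does not say how to break ties.

```python
from datetime import datetime, timezone

def parse_pfif_time(value):
    """Parse a PFIF UTC timestamp such as 2010-01-17T02:30:00Z."""
    fmt = "%Y-%m-%dT%H:%M:%S.%fZ" if "." in value else "%Y-%m-%dT%H:%M:%SZ"
    return datetime.strptime(value, fmt).replace(tzinfo=timezone.utc)

def import_record(store, incoming, now):
    """Import a person record into store (a dict keyed by person_record_id).

    Keeps the version with the latest source_date. `now` is the import time
    as a PFIF timestamp string. Returns True if the incoming record was stored.
    """
    key = incoming["person_record_id"]
    existing = store.get(key)
    if existing is not None:
        kept = parse_pfif_time(existing["source_date"])
        offered = parse_pfif_time(incoming["source_date"])
        if kept >= offered:
            return False  # the existing copy is at least as recent; keep it
    clone = dict(incoming)
    clone["entry_date"] = now  # entry_date is set to the time of import
    store[key] = clone
    return True

store = {}
first = import_record(
    store,
    {"person_record_id": "example.org/p1", "source_date": "2010-01-01T00:00:00Z",
     "full_name": "Old"},
    "2010-01-02T00:00:00Z")
newer = import_record(
    store,
    {"person_record_id": "example.org/p1", "source_date": "2010-01-05T00:00:00Z",
     "full_name": "New"},
    "2010-01-06T00:00:00Z")
stale = import_record(
    store,
    {"person_record_id": "example.org/p1", "source_date": "2010-01-03T00:00:00Z",
     "full_name": "Stale"},
    "2010-01-07T00:00:00Z")
```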
If present, the expiry_date field indicates when a record should be deleted to preserve the privacy of the personal information it contains. Conforming PFIF implementations must meet the following requirements:
To satisfy a user request to delete an existing original record, a PFIF repository should set the record's expiry_date to the current time. (In accordance with the preceding section, it would also set the source_date and entry_date to the current time.) The expiry mechanism described above would then cause the deletion to propagate to other conforming PFIF repositories.
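A deletion request as described above amounts to three field updates on the original record. The sketch below assumes records are plain Python dictionaries and that the caller supplies the current time as a PFIF timestamp string; the function name request_deletion is hypothetical.

```python
def request_deletion(record, now):
    """Return a copy of an original record marked as expired at time `now`.

    Setting expiry_date to the current time signals that the record should
    be deleted; source_date and entry_date are updated because the record
    itself has changed.
    """
    updated = dict(record)
    updated["expiry_date"] = now
    updated["source_date"] = now
    updated["entry_date"] = now
    return updated

original = {
    "person_record_id": "example.org/p1",
    "full_name": "Katherine Doe",
    "source_date": "2010-01-17T02:30:00Z",
    "entry_date": "2010-01-17T02:30:00Z",
}
expired = request_deletion(original, "2010-02-01T12:00:00Z")
```

The record identifier is left untouched, so repositories that hold a clone can match the update to their copy.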
There are two types of records. person records are for information that identifies a person. note records are for information about the current status of a person. Each note record belongs to a particular person, and a person record may have any number of associated note records.
person records may be created both by those who seek a missing person and by those who have information on a missing person. The person record for a person is the point of convergence for all parties; the note records on that person are the growing pool of shared knowledge.
A person record should only be updated if the information in the record is incorrect. If the status or location of a particular person has changed, this should be indicated by adding a new note record associated with that person record.
A person record contains 23 fields. There may be multiple person records for the same person. In fact, any given application that imports data from multiple sources is likely to acquire multiple person records for the same person. It is up to the application to associate such records (see Suggested relational database schema below). It is recommended that applications keep copies of all the records, and separately keep track of which records correspond to the same person.
This metadata is necessary to enable users of the data to trace and ascertain its reliability.
These fields contain information that is used to identify a person; this is information that is not expected to change unless it is incorrect. Searches for person records should search over these fields.
The other field is a very crude way to import foreign data; the formatting guidelines are intended to enable extraction of the foreign data if necessary. For other, free-form text was chosen instead of XML to make it easy for an application to display the other field directly in the UI.
female, male, or other. If the sex is unknown, omit this field.
US when exporting records whose home_state refers to a U.S. state or whose home_postal_code field contains a U.S. zip code.
description: Dark hair, in her late thirties. Also goes by the names "Kate" or "Katie".
automated-pfif-author: ScrapeMatic 0.5
Field names for data fields imported from other applications should begin with a domain name and a slash, where the domain name identifies the entity that defined the field. For example, if example-format.org defines a missing persons format that contains a "birth_city" field, it would be imported into the PFIF other field like this:
example-format.org/birth_city: London, UK
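To illustrate how an application might extract foreign data from the other field, here is a small Python sketch. It assumes a layout inferred from the examples above: each field starts on its own line as a name, a colon, and a space, followed by the value, and indented lines continue the previous value. That layout and the function name parse_other are assumptions of this example, not a restatement of the formatting guidelines.

```python
def parse_other(text):
    """Split the text of an `other` field into a list of (name, value) pairs."""
    fields = []
    for line in text.splitlines():
        if not line.strip():
            continue  # ignore blank lines
        if line[:1].isspace() and fields:
            # An indented line is treated as a continuation of the last value.
            name, value = fields[-1]
            fields[-1] = (name, value + " " + line.strip())
        elif ": " in line:
            name, value = line.split(": ", 1)
            fields.append((name.strip(), value.strip()))
        # Lines that fit neither shape are skipped in this sketch.
    return fields

sample = (
    "description: Dark hair, in her late thirties.\n"
    '    Also goes by the names "Kate" or "Katie".\n'
    "automated-pfif-author: ScrapeMatic 0.5\n"
    "example-format.org/birth_city: London, UK\n"
)
parsed = parse_other(sample)
foreign = [(n, v) for n, v in parsed if "/" in n]  # fields defined by other formats
```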
Each note record belongs to exactly one person record. There may be any number of note records associated with a particular person record. (See below for implementation notes. A database might implement this by including a foreign key, person_record_id, that refers to the person record. An object-oriented representation might implement this by embedding a list of note objects within the person object.)
note records are used to provide updated, current information on a missing person. Every note has a timestamp and information on the author of the note. Applications can use the timestamp to determine the most recent value of a given field. Users can use the author information to ascertain the reliability of a given field.
The found, status, email_of_found_person, phone_of_found_person and last_known_location fields store data that changes over time. When these fields are present in a note record, the record is specifying new values for these fields, and the source_date field indicates the date that the new values took effect. So, for example, an application that wants to display the most recent known location can look for the note with the latest source_date that has a non-empty last_known_location field.
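The lookup described above can be sketched in Python as follows. Notes are represented as plain dictionaries, and the names latest_value and parse_pfif_time are hypothetical helpers introduced for this example.

```python
from datetime import datetime, timezone

def parse_pfif_time(value):
    """Parse a PFIF UTC timestamp such as 2010-01-17T02:30:00Z."""
    fmt = "%Y-%m-%dT%H:%M:%S.%fZ" if "." in value else "%Y-%m-%dT%H:%M:%SZ"
    return datetime.strptime(value, fmt).replace(tzinfo=timezone.utc)

def latest_value(notes, field):
    """Return the value of `field` from the note with the latest source_date
    among the notes where that field is non-empty, or None if there is none."""
    candidates = [n for n in notes if n.get(field)]
    if not candidates:
        return None
    newest = max(candidates, key=lambda n: parse_pfif_time(n["source_date"]))
    return newest[field]

# Hypothetical notes attached to one person record.
notes = [
    {"source_date": "2010-01-17T02:30:00Z", "last_known_location": "Shelter A"},
    {"source_date": "2010-01-18T09:00:00Z", "last_known_location": "",
     "status": "believed_alive"},
    {"source_date": "2010-01-17T20:00:00Z", "last_known_location": "Hospital B"},
]
location = latest_value(notes, "last_known_location")
status = latest_value(notes, "status")
```

Note that the most recent note overall does not supply the location here, because its last_known_location field is empty; each field is resolved independently.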
true if the missing person has been personally contacted or seen by the author of this note, or false otherwise. If this field is true, the text field of this note should describe HOW and WHEN the person was contacted or seen.
information_sought
is_note_author
believed_alive
believed_missing
believed_dead
The XML Namespace for PFIF is:
http://zesty.ca/pfif/1.3
The MIME type for a PFIF document is:
application/pfif+xml
A valid PFIF XML document consists of a single pfif element containing one or more person or note elements, each of which contains child elements for the fields described above.
In a person element, the person_record_id, source_date, and full_name fields are mandatory. In a note element, the note_record_id, author_name, source_date, and text fields are mandatory. All other fields are optional.
The order of the child elements within a person or note element is not significant.
A note element can exist inside or outside a person element. When a note element appears outside a person element, the note must contain a person_record_id. Otherwise, the person_record_id field is optional, and if present, must match the person_record_id of the enclosing person.
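The structural rules above can be checked with a short Python sketch using the standard xml.etree.ElementTree module. It only tests for the presence of mandatory fields and the person_record_id rule for notes; it is not a substitute for schema validation. The function names and the sample document are hypothetical.

```python
import xml.etree.ElementTree as ET

NS = "{http://zesty.ca/pfif/1.3}"
PERSON_REQUIRED = ("person_record_id", "source_date", "full_name")
NOTE_REQUIRED = ("note_record_id", "author_name", "source_date", "text")

def field(element, name):
    """Return the text of a PFIF child element, or None if it is absent."""
    return element.findtext(NS + name)

def check_note(note, person=None):
    """List problems with a note; `person` is the enclosing person, if any."""
    problems = ["note is missing " + n for n in NOTE_REQUIRED
                if field(note, n) is None]
    person_id = field(note, "person_record_id")
    if person is None:
        if person_id is None:
            problems.append("top-level note is missing person_record_id")
    elif person_id is not None and person_id != field(person, "person_record_id"):
        problems.append("note person_record_id does not match enclosing person")
    return problems

def check_document(xml_text):
    """Return a list of problems found in a PFIF XML document."""
    root = ET.fromstring(xml_text)
    problems = []
    for person in root.findall(NS + "person"):
        problems += ["person is missing " + n for n in PERSON_REQUIRED
                     if field(person, n) is None]
        for note in person.findall(NS + "note"):
            problems += check_note(note, person)
    for note in root.findall(NS + "note"):
        problems += check_note(note)
    return problems

SAMPLE = """
<pfif:pfif xmlns:pfif="http://zesty.ca/pfif/1.3">
  <pfif:person>
    <pfif:person_record_id>example.org/p1</pfif:person_record_id>
    <pfif:source_date>2010-01-17T02:30:00Z</pfif:source_date>
    <pfif:full_name>Katherine Doe</pfif:full_name>
    <pfif:note>
      <pfif:note_record_id>example.org/n1</pfif:note_record_id>
      <pfif:author_name>John Doe</pfif:author_name>
      <pfif:source_date>2010-01-17T03:00:00Z</pfif:source_date>
      <pfif:text>Spoke with her by phone at 2am.</pfif:text>
    </pfif:note>
  </pfif:person>
  <pfif:note>
    <pfif:note_record_id>example.org/n2</pfif:note_record_id>
    <pfif:author_name>Jane Roe</pfif:author_name>
    <pfif:source_date>2010-01-17T04:00:00Z</pfif:source_date>
    <pfif:text>Looking for news of my cousin.</pfif:text>
  </pfif:note>
</pfif:pfif>
""".strip()

problems = check_document(SAMPLE)
```

In this sample the nested note is acceptable without a person_record_id, while the second note sits outside any person element and is therefore reported.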
The RELAX NG Schema for PFIF, given in RELAX NG Compact Syntax, is as follows:
namespace pfif = "http://zesty.ca/pfif/1.3"

start = element pfif:pfif { person* & note* }

person = element pfif:person {
    element pfif:person_record_id { record_id } &
    element pfif:entry_date { time }? &
    element pfif:expiry_date { time }? &
    element pfif:author_name { text }? &
    element pfif:author_email { email }? &
    element pfif:author_phone { phone }? &
    element pfif:source_name { text }? &
    element pfif:source_date { time } &
    element pfif:source_url { url }? &
    element pfif:full_name { text } &
    element pfif:first_name { text }? &
    element pfif:last_name { text }? &
    element pfif:sex { sex }? &
    element pfif:date_of_birth { approx_date }? &
    element pfif:age { approx_age }? &
    element pfif:home_street { text }? &
    element pfif:home_neighborhood { text }? &
    element pfif:home_city { text }? &
    element pfif:home_state { text }? &
    element pfif:home_postal_code { text }? &
    element pfif:home_country { country_code }? &
    element pfif:photo_url { url }? &
    element pfif:other { text }? &
    note*
}

note = element pfif:note {
    element pfif:note_record_id { record_id } &
    element pfif:person_record_id { record_id }? &
    element pfif:linked_person_record_id { record_id }? &
    element pfif:entry_date { time }? &
    element pfif:author_name { text } &
    element pfif:author_email { email }? &
    element pfif:author_phone { phone }? &
    element pfif:source_date { time } &
    element pfif:found { boolean }? &
    element pfif:status { status }? &
    element pfif:email_of_found_person { email }? &
    element pfif:phone_of_found_person { phone }? &
    element pfif:last_known_location { text }? &
    element pfif:text { text }
}

record_id = xsd:string { pattern = ".+/.+" }
time = xsd:dateTime { pattern = "\d\d\d\d-\d\d-\d\dT\d\d:\d\d:\d\d(\.\d+)?Z" }
email = xsd:string { pattern = ".+@.+" }
phone = xsd:string { pattern = "[\-+()\d ]+" }
url = text
sex = "female" | "male" | "other"
approx_date = xsd:string { pattern = "\d\d\d\d(-\d\d(-\d\d)?)?" }
approx_age = xsd:string { pattern = "\d+(-\d+)?" }
country_code = xsd:string { pattern = "[A-Z][A-Z]" }
boolean = "true" | "false"
status = "information_sought" | "is_note_author" | "believed_alive" | "believed_missing" | "believed_dead"
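For applications that cannot run a RELAX NG validator, the value patterns in the schema can be approximated with Python's re module. The sketch below copies the patterns from the schema and uses re.fullmatch, since XML Schema patterns must match the whole value. It checks lexical form only (for example, it does not verify that a time is a real calendar date), and Python's regular expression dialect differs from the XML Schema one in small ways, so treat it as an approximation. The names PATTERNS and matches are hypothetical.

```python
import re

# Patterns copied from the schema's datatype definitions.
PATTERNS = {
    "record_id": r".+/.+",
    "time": r"\d\d\d\d-\d\d-\d\dT\d\d:\d\d:\d\d(\.\d+)?Z",
    "email": r".+@.+",
    "phone": r"[\-+()\d ]+",
    "approx_date": r"\d\d\d\d(-\d\d(-\d\d)?)?",
    "approx_age": r"\d+(-\d+)?",
    "country_code": r"[A-Z][A-Z]",
}

# Enumerated values copied from the schema.
ENUMS = {
    "sex": {"female", "male", "other"},
    "boolean": {"true", "false"},
    "status": {"information_sought", "is_note_author", "believed_alive",
               "believed_missing", "believed_dead"},
}

def matches(kind, value):
    """Return True if value has the lexical form the schema gives for kind."""
    if kind in ENUMS:
        return value in ENUMS[kind]
    return re.fullmatch(PATTERNS[kind], value) is not None
```

For example, matches("approx_age", "30-39") accepts an age range, while matches("country_code", "us") rejects a lowercase country code.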
PFIF XML documents can be embedded into Atom 1.0 feeds. The PFIF document should be embedded using an XML namespace and inserted as an immediate child of the entry element.
Atom 1.0 defines a top-level feed element that contains any number of entry elements. The top-level element should declare the PFIF namespace. The recommended prefix is pfif, so the top-level element should look like this:
<feed xmlns="http://www.w3.org/2005/Atom" xmlns:pfif="http://zesty.ca/pfif/1.3"> ... </feed>
The rest of this section offers recommendations on how applications should populate the standard Atom elements so that the feed will make sense to existing feed-reading software. Nonetheless, the embedded PFIF document takes precedence over any redundant information that appears in Atom elements.
Two kinds of PFIF Atom feeds are defined here: person feeds in which each entry contains a person, and note feeds in which each entry contains a note. A person feed is roughly analogous to a blog feed containing blog entries; a note feed is roughly analogous to a comment feed on a particular blog entry. For example, one application might subscribe to a person feed in order to aggregate missing person records from other databases; another application might subscribe to a note feed in order to display a stream of notes with updates about a particular person.
An Atom person feed provides at least the following elements within the feed element:
id
title
subtitle
updated
link, with a rel attribute whose value is self.
An Atom person feed provides at least the following elements within each entry element:
pfif:person, which may contain pfif:note elements. A service wishing to provide a complete export would include all the note records associated with the person here.
id
title
author, with a name element containing the value of the author_name field and an email element containing the value of the author_email field in the person record.
updated
content
source, containing a copy of the title element of this feed. This element may also contain copies of any other child elements of the feed element.
An Atom note feed provides at least the following elements within the feed element:
id
title
subtitle
updated
link, with a rel attribute whose value is self.
An Atom note feed provides at least the following elements within each entry element:
pfif:note
id
title
author, with a name element containing the value of the author_name field and an email element containing the value of the author_email field in the note record.
updated
content
PFIF XML documents can be embedded into RSS 2.0 feeds. (In RSS 2.0 terminology, this section defines an RSS 2.0 module.) The PFIF document should be specified using an XML namespace and embedded as an immediate child of the item element.
RSS 2.0 defines two main elements, channel and item, that are enclosed in a top-level rss element. The top-level element should declare the PFIF namespace. The recommended prefix is pfif, so the top-level element should look like this:
<rss version="2.0" xmlns:pfif="http://zesty.ca/pfif/1.3"> ... </rss>
The rest of this section offers recommendations on how applications should populate the standard RSS elements so that the feed will make sense to existing feed-reading software. Nonetheless, the embedded PFIF document takes precedence over any redundant information that appears in RSS elements.
As in the preceding section, two kinds of PFIF RSS feeds are defined here: person feeds in which each item contains a person, and note feeds in which each item contains a note.
An RSS person feed provides at least the following elements within the channel element:
title
description
lastBuildDate
link
An RSS person feed provides at least the following elements within each item element:
pfif:person, which may contain pfif:note elements. A service wishing to provide a complete export would include all the note records associated with the person here.
guid
title
author
pubDate
description
source
link
An RSS note feed provides at least the following elements within the channel element:
title
description
lastBuildDate
link
An RSS note feed provides at least the following elements within each item element:
pfif:note
guid
author
pubDate
description
This section suggests a possible relational database schema for storing PFIF data. The exact details of a database design are up to each application; this is one possible starting point. A relational database could store PFIF records in two tables, person and note, for the two types of records.
PERSON table:
    string    person_record_id          primary key
    datetime  entry_date
    datetime  expiry_date
    string    author_name
    string    author_email
    string    author_phone
    string    source_name
    datetime  source_date
    string    source_url
    string    full_name
    string    first_name
    string    last_name
    string    sex
    string    date_of_birth
    string    age
    string    home_street
    string    home_neighborhood
    string    home_city
    string    home_state
    string    home_postal_code
    string    photo_url
    text      other

NOTE table:
    string    note_record_id            primary key
    string    person_record_id          foreign key not null
    string    linked_person_record_id   foreign key or null
    datetime  entry_date
    string    author_name
    string    author_email
    string    author_phone
    datetime  source_date
    boolean   found
    string    status
    string    email_of_found_person
    string    phone_of_found_person
    string    last_known_location
    text      text
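As one way to try out the two tables above, the following Python sketch creates them in an in-memory SQLite database using the standard sqlite3 module. The choice of SQLite, the TEXT and INTEGER column types, and storing timestamps as text are assumptions of this example; the column names and key constraints follow the tables above.

```python
import sqlite3

SCHEMA = """
CREATE TABLE person (
    person_record_id   TEXT PRIMARY KEY,
    entry_date         TEXT,
    expiry_date        TEXT,
    author_name        TEXT,
    author_email       TEXT,
    author_phone       TEXT,
    source_name        TEXT,
    source_date        TEXT,
    source_url         TEXT,
    full_name          TEXT,
    first_name         TEXT,
    last_name          TEXT,
    sex                TEXT,
    date_of_birth      TEXT,
    age                TEXT,
    home_street        TEXT,
    home_neighborhood  TEXT,
    home_city          TEXT,
    home_state         TEXT,
    home_postal_code   TEXT,
    photo_url          TEXT,
    other              TEXT
);
CREATE TABLE note (
    note_record_id           TEXT PRIMARY KEY,
    person_record_id         TEXT NOT NULL REFERENCES person(person_record_id),
    linked_person_record_id  TEXT REFERENCES person(person_record_id),
    entry_date               TEXT,
    author_name              TEXT,
    author_email             TEXT,
    author_phone             TEXT,
    source_date              TEXT,
    found                    INTEGER,  -- SQLite has no boolean type; 0 or 1
    status                   TEXT,
    email_of_found_person    TEXT,
    phone_of_found_person    TEXT,
    last_known_location      TEXT,
    text                     TEXT
);
"""

# SQLite does not enforce foreign keys unless asked to; this sketch leaves
# enforcement off, so the REFERENCES clauses are documentation only.
conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

conn.execute(
    "INSERT INTO person (person_record_id, source_date, full_name) VALUES (?, ?, ?)",
    ("example.org/p1", "2010-01-17T02:30:00Z", "Katherine Doe"))
conn.execute(
    "INSERT INTO note (note_record_id, person_record_id, author_name, source_date, text)"
    " VALUES (?, ?, ?, ?, ?)",
    ("example.org/n1", "example.org/p1", "John Doe", "2010-01-17T03:00:00Z",
     "Spoke with her by phone."))
notes_for_person = conn.execute(
    "SELECT note_record_id FROM note WHERE person_record_id = ?",
    ("example.org/p1",)).fetchall()
```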
To link a foreign person record with a local person record, the application adds a note associated with the local person record, with a linked_person_record_id field containing the person_record_id of the foreign record. The other fields of the note describe the circumstances of the decision to merge: source_date indicates the date of the decision, text gives the reason for the decision, and author_name names the person, program, or other entity that made the decision. This specification does not dictate how an application would decide whether to merge two records; a merge could be initiated by a human operator or by a software algorithm that looks for records with similar data. Recording the merge decision in a note record makes it possible to back out of a bad merge decision, and recording the name of the person or program in the author_name field makes it possible to track down the cause of an incorrect merge.
When displaying a person record, the application can then look for all the non-empty linked_person_record_id fields among the notes that belong to that person record, and display all the linked records or a merged view of the linked records.
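The lookup of linked records can be sketched in Python as follows. Notes are plain dictionaries here, and the function name linked_person_ids and the sample identifiers are hypothetical.

```python
def linked_person_ids(notes, person_record_id):
    """Return the distinct linked_person_record_id values found on the notes
    that belong to the given person record, in the order first seen."""
    linked = []
    for note in notes:
        target = note.get("linked_person_record_id")
        if (note.get("person_record_id") == person_record_id
                and target and target not in linked):
            linked.append(target)
    return linked

# Hypothetical notes: the second one records a merge decision.
notes = [
    {"note_record_id": "example.org/n1", "person_record_id": "example.org/p1",
     "author_name": "John Doe", "text": "Spoke with her by phone."},
    {"note_record_id": "example.org/n2", "person_record_id": "example.org/p1",
     "linked_person_record_id": "other.example.net/p77",
     "author_name": "MergeBot", "text": "Same name and date of birth."},
    {"note_record_id": "example.org/n3", "person_record_id": "example.org/p2",
     "linked_person_record_id": "other.example.net/p12",
     "author_name": "MergeBot", "text": "Same phone number."},
]
linked = linked_person_ids(notes, "example.org/p1")
```

Because the link lives in a note, undoing a bad merge in this representation only requires removing or superseding that note; the person records themselves are unchanged.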
person records gained four new fields: sex, date_of_birth, age, and home_country.
The home_zip field was replaced with home_postal_code.
note records gained three new fields: person_record_id, linked_person_record_id, and status.
In the PFIF XML format, note elements became allowed outside of person elements.
Aside from the note_record_id and person_record_id fields, which had to appear first, the rest of the child elements became permissible in any order.
Atom entries and RSS items came to contain individual pfif:person and pfif:note elements with no enclosing pfif:pfif element.
The source_date field became mandatory on person records.
Records can be updated by (and only by) their original repository, and the source_date must be updated when a record changes.
person records gained the mandatory full_name field; first_name and last_name became optional.
person records gained the new expiry_date field, with conformance requirements for data deletion and propagation of the expiry date.
In the PFIF XML format, all the child elements of person elements and note elements became permissible in any order.
The initial data model on which the first version of PFIF was based is due to the CiviCRM team, David Geilhufe, and Kieran Lal. Luke Blanshard, Tony Chang, Josh Kleinpeter, Kieran Lal, Jonathan Plax, Gabe Wachob, Ka-Ping Yee, Steve Hakusa, Mark Prutsalis, Lee Schumacher, and other participants on the working group list (pfif@googlegroups.com) contributed to the current design of PFIF.