PFIF 1.4 examples:
http://zesty.ca/pfif/1.4/examples.html
FAQ and other information on PFIF:
http://zesty.ca/pfif
URL of this specification:
http://zesty.ca/pfif/1.4
This document is licensed under the GNU Free Documentation License 1.2.
This document defines the People Finder Interchange Format, which consists of a data model and an XML-based exchange format for sharing data about people who are missing or displaced by natural or human-made disasters. The data model is first described in a manner independent of implementation style (object-oriented, relational, or XML), then the PFIF XML format is specified by a RELAX NG schema. This document also offers an example of a possible relational database schema for PFIF data.
Each PFIF repository may contain original records and clone records. An original record is a record residing in its original repository; a clone record is a copy of a record that originated in another repository. The following diagram describes the life of a PFIF record as it is created and then travels to other repositories.
                     .----------------------.
                     | 1. real-world facts  |
                     '----------------------'
                            |        |
        entered by a human  |        |  entered by a human
    into a PFIF repository  |        |  into a non-PFIF repository
                            |        |
  entry_date, source_date,  |        |
   source_name, source_url  |        |
 are set by the repository  |        |
                            v        v
.------------------------------.  .---------------------------------.
| 2a. original PFIF record in  |  | 2b. original non-PFIF record in |
| record's original repository |  | record's original repository    |
'------------------------------'  '---------------------------------'
                            |        |
        exported as a PFIF  |        |  parsed and converted to the PFIF
          document or feed  |        |  data model by a human or program
                            |        |
                            |        |  source_date, source_name, source_url
                            |        |  are set by the human or program
                            v        v
                       .-----------------.
      .--------------> | 3. PFIF record  |
      |                '-----------------'
      |                         |
      |                         |  loaded into a PFIF repository
      |                         |
      |                         |  entry_date is set to date/time of import
      |                         v
      |     .--------------------------------------.
      |     | 4. clone record in a PFIF repository |
      |     '--------------------------------------'
      |                         |
      |                         |  exported as a PFIF document or feed
      |                         |
      '-------------------------'
Whenever a PFIF repository adds a new original record or clone record, it must set the entry_date field to the current time. This time value must never decrease as records are added. A client can incrementally update its copy of a repository by querying for all records with an entry_date greater than or equal to the entry_date of the last received record.
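The incremental update described above can be sketched as follows. This is only an illustration: the in-memory list of dictionaries, the helper names parse_pfif_time and records_since, and the example.org identifiers are assumptions made for the example and are not part of PFIF. Timestamps are parsed rather than compared as strings, because a value with fractional seconds does not sort correctly as text.

```python
from datetime import datetime, timezone


def parse_pfif_time(value):
    """Parse a PFIF UTC timestamp such as 2010-01-02T03:04:05Z.

    Fractional seconds are accepted up to six digits (a limit of
    strptime, not of PFIF).
    """
    fmt = "%Y-%m-%dT%H:%M:%S.%fZ" if "." in value else "%Y-%m-%dT%H:%M:%SZ"
    return datetime.strptime(value, fmt).replace(tzinfo=timezone.utc)


def records_since(records, last_entry_date):
    """Return records with entry_date >= last_entry_date, oldest first."""
    cutoff = parse_pfif_time(last_entry_date)
    fresh = [r for r in records if parse_pfif_time(r["entry_date"]) >= cutoff]
    return sorted(fresh, key=lambda r: parse_pfif_time(r["entry_date"]))


# Hypothetical repository contents; two records share an entry_date.
repository = [
    {"person_record_id": "example.org/p1", "entry_date": "2010-01-01T00:00:00Z"},
    {"person_record_id": "example.org/p2", "entry_date": "2010-01-02T00:00:00Z"},
    {"person_record_id": "example.org/p3", "entry_date": "2010-01-02T00:00:00Z"},
]

# The client last received p2, so it asks for everything from that time on.
batch = records_since(repository, "2010-01-02T00:00:00Z")
batch_ids = [r["person_record_id"] for r in batch]
```

Because the comparison is "greater than or equal to", the last received record is returned again along with any record that shares its entry_date; a client would typically discard duplicates by record identifier.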
The original repository for a record (2a or 2b in the diagram above) can update any of the fields on a record after it is created, except the person_record_id field. Whenever a PFIF repository creates or updates an original record, it must set both the source_date and entry_date fields to the current time. When a repository imports a PFIF record that has the same record identifier as an existing record, it should keep the version with the latest source_date.
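A minimal sketch of the import rule above, assuming records are held in a dictionary keyed by person_record_id. The function name import_record, the store layout, and the choice to keep the existing copy when two source_date values are equal are assumptions of this example; the specification only says that the version with the latest source_date should be kept.

```python
from datetime import datetime, timezone


def parse_pfif_time(value):
    """Parse a PFIF UTC timestamp such as 2010-01-02T03:04:05Z."""
    fmt = "%Y-%m-%dT%H:%M:%S.%fZ" if "." in value else "%Y-%m-%dT%H:%M:%SZ"
    return datetime.strptime(value, fmt).replace(tzinfo=timezone.utc)


def import_record(store, incoming, now):
    """Import a person record, keeping the version with the latest source_date.

    store maps person_record_id to a record; now is the import time as a
    PFIF timestamp string. Returns the record that ends up in the store.
    """
    record_id = incoming["person_record_id"]
    existing = store.get(record_id)
    if existing is not None:
        existing_date = parse_pfif_time(existing["source_date"])
        if existing_date >= parse_pfif_time(incoming["source_date"]):
            return existing  # incoming copy is not newer; keep what we have
    clone = dict(incoming, entry_date=now)  # entry_date is set on import
    store[record_id] = clone
    return clone


store = {}
newer = {"person_record_id": "example.org/p1",
         "source_date": "2010-01-02T00:00:00Z", "full_name": "Newer Version"}
older = {"person_record_id": "example.org/p1",
         "source_date": "2010-01-01T00:00:00Z", "full_name": "Older Version"}

import_record(store, newer, now="2010-01-03T00:00:00Z")
import_record(store, older, now="2010-01-04T00:00:00Z")  # ignored as stale
```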
If present, the expiry_date field indicates when a record should be deleted to preserve the privacy of the personal information it contains. Conforming PFIF implementations must meet the following requirements:
To satisfy a user request to delete an existing original record, a PFIF repository should set the record's expiry_date to the current time. (In accordance with the preceding section, it would also set the source_date and entry_date to the current time.) The expiry mechanism described above would then cause the deletion to propagate to other conforming PFIF repositories.
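As an illustration of the deletion request described above, the following sketch marks an original record as expired by setting the three date fields to the current time. The function name request_deletion and the dictionary representation are assumptions of this example; what a repository does with the remaining fields once the record has expired is governed by the conformance requirements and is not shown here.

```python
def request_deletion(record, now):
    """Return a copy of an original record marked as expired at time now.

    now is a PFIF timestamp string. The input record is left unchanged.
    """
    updated = dict(record)
    updated["expiry_date"] = now  # the record expires immediately
    updated["source_date"] = now  # the original record was updated
    updated["entry_date"] = now   # so that incremental clients pick it up
    return updated


original = {
    "person_record_id": "example.org/p1",
    "source_date": "2010-01-01T00:00:00Z",
    "entry_date": "2010-01-01T00:00:00Z",
    "full_name": "Example Person",
}
expired = request_deletion(original, now="2010-02-01T12:00:00Z")
```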
There are two types of records. person records are for information that identifies a person. note records are for information about the current status of a person. Each note record belongs to a particular person, and a person record may have any number of associated note records.
person records may be created both by those who seek a missing person and by those who have information on a missing person. The person record for a person is the point of convergence for all parties; the note records on that person are the growing pool of shared knowledge.
A person record should only be updated if the information in the record is incorrect. If the status or location of a particular person has changed, this should be indicated by adding a new note record associated with that person record.
A person record contains 25 fields. There may be multiple person records for the same person. In fact, any given application that imports data from multiple sources is likely to acquire multiple person records for the same person. It is up to the application to associate such records (see Suggested relational database schema below). It is recommended that applications keep copies of all the records, and separately keep track of which records correspond to the same person.
This metadata is necessary to enable users of the data to trace and ascertain its reliability.
These fields contain information that is used to identify a person; this is information that is not expected to change unless it is incorrect. Searches for person records should search over these fields.
sex: female, male, or other. If the sex is unknown, omit this field.
Each note record belongs to exactly one person record. There may be any number of note records associated with a particular person record. (See below for implementation notes. A database might implement this by including a foreign key, person_record_id, that refers to the person record. An object-oriented representation might implement this by embedding a list of note objects within the person object.)
note records are used to provide updated, current information on a missing person. Every note has a timestamp and information on the author of the note. Applications can use the timestamp to determine the most recent value of a given field. Users can use the author information to ascertain the reliability of a given field.
The author_made_contact, status, email_of_found_person, phone_of_found_person and last_known_location fields store data that changes over time. When these fields are present in a note record, the record is specifying new values for these fields, and the source_date field indicates the date that the new values took effect. So, for example, an application that wants to display the most recent known location can look for the note with the latest source_date that has a non-empty last_known_location field.
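The lookup described above can be sketched as follows. The helper name latest_value and the list-of-dictionaries representation of notes are assumptions of this example, not part of PFIF.

```python
from datetime import datetime, timezone


def parse_pfif_time(value):
    """Parse a PFIF UTC timestamp such as 2010-01-02T03:04:05Z."""
    fmt = "%Y-%m-%dT%H:%M:%S.%fZ" if "." in value else "%Y-%m-%dT%H:%M:%SZ"
    return datetime.strptime(value, fmt).replace(tzinfo=timezone.utc)


def latest_value(notes, field):
    """Return the value of field from the newest note that has it non-empty.

    Returns None when no note carries a value for the field.
    """
    candidates = [n for n in notes if n.get(field)]
    if not candidates:
        return None
    newest = max(candidates, key=lambda n: parse_pfif_time(n["source_date"]))
    return newest[field]


notes = [
    {"source_date": "2010-01-01T08:00:00Z", "last_known_location": "Shelter A"},
    {"source_date": "2010-01-03T08:00:00Z", "last_known_location": "Shelter B"},
    # The newest note says nothing about location, so it is skipped.
    {"source_date": "2010-01-05T08:00:00Z", "status": "believed_alive"},
]

location = latest_value(notes, "last_known_location")
status = latest_value(notes, "status")
```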
author_made_contact: true if the author of this note has personally contacted the missing person, or false otherwise. If this field is true, the text field of this note should describe HOW and WHEN the person was contacted or seen.
status: one of information_sought, is_note_author, believed_alive, believed_missing, or believed_dead.
The XML namespace for PFIF is:
http://zesty.ca/pfif/1.4
The MIME type for a PFIF document is:
application/pfif+xml
A valid PFIF XML document consists of a single pfif element containing one or more person or note elements, each of which contains child elements for the fields described above. In a person element, the person_record_id, source_date, and full_name fields are mandatory. In a note element, the note_record_id, author_name, source_date, and text fields are mandatory. All other fields are optional. The order of the child elements within a person or note element is not significant.
A note element can exist inside or outside a person element. When a note element appears outside a person element, the note must contain a person_record_id. Otherwise, the person_record_id field is optional, and if present, must match the person_record_id of the enclosing person.
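The structural rules above can be checked with a short script such as the following sketch, which uses Python's standard xml.etree.ElementTree module. The function name missing_fields, the wording of the messages, and the sample document are assumptions of this example. The note check also requires the text field, following the RELAX NG schema. The sketch checks only the presence of mandatory fields and the person_record_id rule for notes; it does not validate field formats.

```python
import xml.etree.ElementTree as ET

PFIF_NS = "http://zesty.ca/pfif/1.4"
REQUIRED_PERSON = ["person_record_id", "source_date", "full_name"]
REQUIRED_NOTE = ["note_record_id", "author_name", "source_date", "text"]


def tag(name):
    """Return the namespace-qualified tag name used by ElementTree."""
    return f"{{{PFIF_NS}}}{name}"


def missing_fields(document_text):
    """Return a list of human-readable problems found in a PFIF document."""
    root = ET.fromstring(document_text)
    problems = []

    def check(element, kind, required):
        for name in required:
            if not (element.findtext(tag(name)) or "").strip():
                problems.append(f"{kind} missing {name}")

    for person in root.findall(tag("person")):
        check(person, "person", REQUIRED_PERSON)
        person_id = person.findtext(tag("person_record_id"))
        for note in person.findall(tag("note")):
            check(note, "note", REQUIRED_NOTE)
            nested_id = note.findtext(tag("person_record_id"))
            if nested_id is not None and nested_id != person_id:
                problems.append("note person_record_id does not match person")

    # Notes outside a person element must name their person explicitly.
    for note in root.findall(tag("note")):
        check(note, "note", REQUIRED_NOTE + ["person_record_id"])
    return problems


EXAMPLE = """<pfif:pfif xmlns:pfif="http://zesty.ca/pfif/1.4">
  <pfif:person>
    <pfif:person_record_id>example.org/p1</pfif:person_record_id>
    <pfif:source_date>2010-01-01T00:00:00Z</pfif:source_date>
    <pfif:full_name>Example Person</pfif:full_name>
    <pfif:note>
      <pfif:note_record_id>example.org/n1</pfif:note_record_id>
      <pfif:author_name>Example Author</pfif:author_name>
      <pfif:source_date>2010-01-02T00:00:00Z</pfif:source_date>
      <pfif:text>Seen at the community center.</pfif:text>
    </pfif:note>
  </pfif:person>
  <pfif:note>
    <pfif:note_record_id>example.org/n2</pfif:note_record_id>
    <pfif:author_name>Example Author</pfif:author_name>
    <pfif:source_date>2010-01-03T00:00:00Z</pfif:source_date>
    <pfif:text>This note lacks a person_record_id.</pfif:text>
  </pfif:note>
</pfif:pfif>"""

problems = missing_fields(EXAMPLE)
```

In this sample the nested note is accepted without a person_record_id, while the note outside the person element is reported because it does not say which person it belongs to.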
The RELAX NG Schema for PFIF, given in RELAX NG Compact Syntax, is as follows:
namespace pfif = "http://zesty.ca/pfif/1.4"
start = element pfif:pfif { person* & note* }
person = element pfif:person {
    element pfif:person_record_id { record_id } &
    element pfif:entry_date { time } ? &
    element pfif:expiry_date { time } ? &
    element pfif:author_name { text } ? &
    element pfif:author_email { email } ? &
    element pfif:author_phone { phone } ? &
    element pfif:source_name { text } ? &
    element pfif:source_date { time } &
    element pfif:source_url { url } ? &
    element pfif:full_name { text } &
    element pfif:given_name { text } ? &
    element pfif:family_name { text } ? &
    element pfif:alternate_names { text } ? &
    element pfif:description { text } ? &
    element pfif:sex { sex } ? &
    element pfif:date_of_birth { approx_date } ? &
    element pfif:age { approx_age } ? &
    element pfif:home_street { text } ? &
    element pfif:home_neighborhood { text } ? &
    element pfif:home_city { text } ? &
    element pfif:home_state { text } ? &
    element pfif:home_postal_code { text } ? &
    element pfif:home_country { country_code } ? &
    element pfif:photo_url { url } ? &
    element pfif:profile_urls { text } ? &
    note*
}
note = element pfif:note {
    element pfif:note_record_id { record_id } &
    element pfif:person_record_id { record_id } ? &
    element pfif:linked_person_record_id { record_id } ? &
    element pfif:entry_date { time } ? &
    element pfif:author_name { text } &
    element pfif:author_email { email } ? &
    element pfif:author_phone { phone } ? &
    element pfif:source_date { time } &
    element pfif:author_made_contact { boolean } ? &
    element pfif:status { status } ? &
    element pfif:email_of_found_person { email } ? &
    element pfif:phone_of_found_person { phone } ? &
    element pfif:last_known_location { text } ? &
    element pfif:text { text } &
    element pfif:photo_url { url } ?
}
record_id = xsd:string { pattern = ".+/.+" }
time = xsd:dateTime { pattern = "\d\d\d\d-\d\d-\d\dT\d\d:\d\d:\d\d(\.\d+)?Z" }
email = xsd:string { pattern = ".+@.+" }
phone = xsd:string { pattern = "[\-+()\d ]+" }
url = text
sex = "female" | "male" | "other"
approx_date = xsd:string { pattern = "\d\d\d\d(-\d\d(-\d\d)?)?" }
approx_age = xsd:string { pattern = "\d+(-\d+)?" }
country_code = xsd:string { pattern = "[A-Z][A-Z]" }
boolean = "true" | "false"
status = "information_sought" | "is_note_author" |
         "believed_alive" | "believed_missing" | "believed_dead"
PFIF XML documents can be embedded into Atom 1.0 feeds. The PFIF document should be embedded using an XML namespace and inserted as an immediate child of the entry element.
Atom 1.0 defines a top-level feed element that contains any number of entry elements. The top-level element should declare the PFIF namespace. The recommended prefix is pfif, so the top-level element should look like this:
<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:pfif="http://zesty.ca/pfif/1.4">
  ...
</feed>
The rest of this section offers recommendations on how applications should populate the standard Atom elements so that the feed will make sense to existing feed-reading software. Nonetheless, the embedded PFIF document takes precedence over any redundant information that appears in Atom elements.
Two kinds of PFIF Atom feeds are defined here: person feeds in which each entry contains a person, and note feeds in which each entry contains a note. A person feed is roughly analogous to a blog feed containing blog entries; a note feed is roughly analogous to a comment feed on a particular blog entry. For example, one application might subscribe to a person feed in order to aggregate missing person records from other databases; another application might subscribe to a note feed in order to display a stream of notes with updates about a particular person.
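As an illustration of how a subscriber might read a person feed, the following sketch extracts the embedded PFIF data from each Atom entry using Python's standard xml.etree.ElementTree module. It assumes that each entry carries a pfif:person element as an immediate child. The function name people_in_feed and the sample feed are assumptions of this example, and the sample feed is deliberately minimal: it omits most of the standard Atom elements that a real feed would provide.

```python
import xml.etree.ElementTree as ET

NAMESPACES = {
    "atom": "http://www.w3.org/2005/Atom",
    "pfif": "http://zesty.ca/pfif/1.4",
}

FEED = """<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:pfif="http://zesty.ca/pfif/1.4">
  <entry>
    <pfif:person>
      <pfif:person_record_id>example.org/p1</pfif:person_record_id>
      <pfif:source_date>2010-01-01T00:00:00Z</pfif:source_date>
      <pfif:full_name>Example Person</pfif:full_name>
    </pfif:person>
    <title>Example Person</title>
  </entry>
</feed>"""


def people_in_feed(feed_text):
    """Return one dictionary of PFIF fields per entry that embeds a person."""
    root = ET.fromstring(feed_text)
    people = []
    for entry in root.findall("atom:entry", NAMESPACES):
        person = entry.find("pfif:person", NAMESPACES)
        if person is None:
            continue
        fields = {}
        for child in person:
            if len(child) == 0:  # skip nested pfif:note elements
                local_name = child.tag.split("}", 1)[1]
                fields[local_name] = (child.text or "").strip()
        people.append(fields)
    return people


people = people_in_feed(FEED)
```

The sketch reads the name from the embedded pfif:full_name field rather than from the Atom title, since the embedded PFIF data takes precedence over redundant information in Atom elements.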
An Atom person feed provides at least the following elements within the feed element:
rel attribute whose value is self.
An Atom person feed provides at least the following elements within each entry element:
An Atom note feed provides at least the following elements within the feed element:
rel attribute whose value is self.
An Atom note feed provides at least the following elements within each entry element:
PFIF XML documents can be embedded into RSS 2.0 feeds. (In RSS 2.0 terminology, this section defines an RSS 2.0 module.) The PFIF document should be specified using an XML namespace and embedded as an immediate child of the item element.
RSS 2.0 defines two main elements, channel and item, that are enclosed in a top-level rss element. The top-level element should declare the PFIF namespace. The recommended prefix is pfif, so the top-level element should look like this:
<rss version="2.0" xmlns:pfif="http://zesty.ca/pfif/1.4"> ... </rss>
The rest of this section offers recommendations on how applications should populate the standard RSS elements so that the feed will make sense to existing feed-reading software. Nonetheless, the embedded PFIF document takes precedence over any redundant information that appears in RSS elements.
As in the preceding section, two kinds of PFIF RSS feeds are defined here: person feeds in which each item contains a person, and note feeds in which each item contains a note.
An RSS person feed provides at least the following elements within the channel element:
An RSS person feed provides at least the following elements within each item element:
An RSS note feed provides at least the following elements within the channel element:
An RSS note feed provides at least the following elements within each item element:
This section suggests a possible relational database schema for storing PFIF data. The exact details of a database design are up to each application; this is one possible starting point. A relational database could store PFIF records in two tables, person and note, for the two types of records.
PERSON table:
string person_record_id primary key
datetime entry_date
datetime expiry_date
string author_name
string author_email
string author_phone
string source_name
datetime source_date
string source_url
string full_name
string given_name
string family_name
string alternate_names
text description
string sex
string date_of_birth
string age
string home_street
string home_neighborhood
string home_city
string home_state
string home_postal_code
string home_country
string photo_url
string profile_urls
NOTE table:
string note_record_id primary key
string person_record_id foreign key not null
string linked_person_record_id foreign key or null
datetime entry_date
string author_name
string author_email
string author_phone
datetime source_date
boolean author_made_contact
string status
string email_of_found_person
string phone_of_found_person
string last_known_location
text text
string photo_url
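The two tables above could be created in SQLite roughly as follows. This is a sketch under stated assumptions: the choice of SQLite, the mapping of every string and datetime column to TEXT, and the sample rows are all choices made for this example, and timestamps are stored as PFIF strings. Each application is free to choose different types and constraints.

```python
import sqlite3

SCHEMA = """
CREATE TABLE person (
    person_record_id TEXT PRIMARY KEY,
    entry_date TEXT, expiry_date TEXT,
    author_name TEXT, author_email TEXT, author_phone TEXT,
    source_name TEXT, source_date TEXT, source_url TEXT,
    full_name TEXT, given_name TEXT, family_name TEXT, alternate_names TEXT,
    description TEXT, sex TEXT, date_of_birth TEXT, age TEXT,
    home_street TEXT, home_neighborhood TEXT, home_city TEXT,
    home_state TEXT, home_postal_code TEXT, home_country TEXT,
    photo_url TEXT, profile_urls TEXT
);
CREATE TABLE note (
    note_record_id TEXT PRIMARY KEY,
    person_record_id TEXT NOT NULL REFERENCES person(person_record_id),
    linked_person_record_id TEXT REFERENCES person(person_record_id),
    entry_date TEXT,
    author_name TEXT, author_email TEXT, author_phone TEXT,
    source_date TEXT,
    author_made_contact INTEGER,  -- 0 or 1; SQLite has no boolean type
    status TEXT,
    email_of_found_person TEXT, phone_of_found_person TEXT,
    last_known_location TEXT,
    text TEXT,
    photo_url TEXT
);
"""

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves this off by default
conn.executescript(SCHEMA)

conn.execute(
    "INSERT INTO person (person_record_id, source_date, full_name) "
    "VALUES (?, ?, ?)",
    ("example.org/p1", "2010-01-01T00:00:00Z", "Example Person"),
)
conn.execute(
    "INSERT INTO note (note_record_id, person_record_id, author_name, "
    "source_date, text) VALUES (?, ?, ?, ?, ?)",
    ("example.org/n1", "example.org/p1", "Example Author",
     "2010-01-02T00:00:00Z", "Seen at the community center."),
)

# All notes for one person, oldest first.
rows = conn.execute(
    "SELECT note_record_id FROM note WHERE person_record_id = ? "
    "ORDER BY source_date",
    ("example.org/p1",),
).fetchall()
```

Ordering by source_date as text gives chronological order only while every stored timestamp uses the same format, for example always without fractional seconds.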
To link a foreign person record with a local person record, the application adds a note associated with the local person record, with a linked_person_record_id field containing the person_record_id of the foreign record. The other fields of the note describe the circumstances of the decision to merge: source_date indicates the date of the decision, text gives the reason for the decision, and author_name names the person, program, or other entity that made the decision. This specification does not dictate how an application would decide whether to merge two records; a merge could be initiated by a human operator or by a software algorithm that looks for records with similar data. Recording the merge decision in a note record makes it possible to back out of a bad merge decision, and recording the name of the person or program in the author_name field makes it possible to track down the cause of an incorrect merge.
When displaying a person record, the application can then look for all the non-empty linked_person_record_id fields among the notes that belong to that person record, and display all the linked records or a merged view of the linked records.
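The lookup of linked records can be sketched as follows. The function name linked_person_ids, the list-of-dictionaries representation, and the sample identifiers are assumptions of this example; how the linked records are then displayed or merged is left to the application.

```python
def linked_person_ids(person_record_id, notes):
    """Return the sorted, distinct record ids linked from a person's notes."""
    linked = {
        note["linked_person_record_id"]
        for note in notes
        if note.get("person_record_id") == person_record_id
        and note.get("linked_person_record_id")
    }
    return sorted(linked)


notes = [
    # A merge decision recorded as a note on the local record.
    {"note_record_id": "example.org/n1",
     "person_record_id": "example.org/p1",
     "linked_person_record_id": "other.example.net/p77",
     "author_name": "Example Operator",
     "source_date": "2010-01-02T00:00:00Z",
     "text": "Same name, age, and home street."},
    # An ordinary status note with no link.
    {"note_record_id": "example.org/n2",
     "person_record_id": "example.org/p1",
     "author_name": "Example Author",
     "source_date": "2010-01-03T00:00:00Z",
     "text": "Seen at the community center."},
    # A link that belongs to a different local record.
    {"note_record_id": "example.org/n3",
     "person_record_id": "example.org/p2",
     "linked_person_record_id": "other.example.net/p80",
     "author_name": "Example Operator",
     "source_date": "2010-01-04T00:00:00Z",
     "text": "Matching photo."},
]

linked = linked_person_ids("example.org/p1", notes)
```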
person records gained four new fields: sex, date_of_birth, age, and home_country.
The home_zip field was replaced with home_postal_code. To upgrade from a PFIF 1.1 repository, export the old home_zip values in the home_postal_code field and set the home_country field to US in records whose home_state refers to a U.S. state or whose home_postal_code field contains a U.S. zip code.
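The home_zip upgrade step could be sketched as follows. The function name upgrade_person, the use of two-letter state abbreviations to recognize a U.S. state, and the five-digit or ZIP+4 pattern are heuristics assumed for this example; the specification does not say how a U.S. state or zip code should be detected. Other field changes between versions are not handled here.

```python
import re

# Two-letter abbreviations for the 50 U.S. states and the District of Columbia.
US_STATES = frozenset("""
    AL AK AZ AR CA CO CT DE FL GA HI ID IL IN IA KS KY LA ME MD MA MI MN MS MO
    MT NE NV NH NJ NM NY NC ND OH OK OR PA RI SC SD TN TX UT VT VA WA WV WI WY
    DC
""".split())
US_ZIP = re.compile(r"\d{5}(-\d{4})?")


def upgrade_person(old):
    """Return a copy of a PFIF 1.1 person record with home_zip upgraded."""
    new = {key: value for key, value in old.items() if key != "home_zip"}
    postal_code = (old.get("home_zip") or "").strip()
    if postal_code:
        new["home_postal_code"] = postal_code
    state = (old.get("home_state") or "").strip().upper()
    if state in US_STATES or US_ZIP.fullmatch(postal_code):
        new["home_country"] = "US"
    return new


upgraded = upgrade_person({
    "person_record_id": "example.org/p1",
    "home_state": "LA",
    "home_zip": "70112",
})
elsewhere = upgrade_person({
    "person_record_id": "example.org/p2",
    "home_state": "Ontario",
    "home_zip": "K1A 0B1",
})
```

A five-digit code is not proof of a U.S. address, since several other countries also use five-digit postal codes, so an application with records from outside the United States may prefer to rely on home_state alone.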
note records gained three new fields: person_record_id, linked_person_record_id, and status.
In the PFIF XML format, note elements became allowed outside of person elements. Aside from the note_record_id and person_record_id fields, which had to appear first, the rest of the child elements became permissible in any order.
Atom entries and RSS items came to contain individual pfif:person and pfif:note elements with no enclosing pfif:pfif element.
The source_date field became mandatory on person records. Records can be updated by (and only by) their original repository, and the source_date must be updated when a record changes.
person records gained the mandatory full_name field; first_name and last_name became optional.
person records gained the new expiry_date field, with conformance requirements for data deletion and propagation of the expiry date.
In the PFIF XML format, all the child elements of person elements and note elements became permissible in any order.
person records gained the optional alternate_names field and the optional profile_urls field.
In person records, the first_name field was renamed to given_name and the last_name field was renamed to family_name.
In person records, the description field replaced the old other field.
note records gained the photo_url field.
In note records, the found field was renamed to author_made_contact.
In note records, there is now a convention for specifying geographic coordinates in the existing last_known_location field.
The initial data model on which the first version of PFIF was based is due to the CiviCRM team, David Geilhufe, and Kieran Lal. Luke Blanshard, Tony Chang, Josh Kleinpeter, Kieran Lal, Jonathan Plax, Gabe Wachob, Ka-Ping Yee, Steve Hakusa, Mark Prutsalis, Lee Schumacher, the Missing Persons Community of Interest (tci_missingpersons@googlegroups.com), and other participants on the working group list (pfif@googlegroups.com) contributed to the current design of PFIF.